From patchwork Mon Feb 24 12:13:43 2025
From: Vincent Donnefort <vdonnefort@google.com>
Date: Mon, 24 Feb 2025 12:13:43 +0000
Subject: [PATCH 01/11] ring-buffer: Introduce ring-buffer remote
Message-ID: <20250224121353.98697-2-vdonnefort@google.com>
In-Reply-To: <20250224121353.98697-1-vdonnefort@google.com>
References: <20250224121353.98697-1-vdonnefort@google.com>
To: rostedt@goodmis.org, mhiramat@kernel.org, mathieu.desnoyers@efficios.com,
    linux-trace-kernel@vger.kernel.org, maz@kernel.org, oliver.upton@linux.dev,
    joey.gouly@arm.com, suzuki.poulose@arm.com, yuzenghui@huawei.com
Cc: kvmarm@lists.linux.dev, linux-arm-kernel@lists.infradead.org,
    jstultz@google.com, qperret@google.com, will@kernel.org,
    kernel-team@android.com, linux-kernel@vger.kernel.org,
    Vincent Donnefort <vdonnefort@google.com>

A ring-buffer remote is an entity outside of the kernel (most likely
firmware or a hypervisor) capable of writing events into a ring-buffer
that follows the same format as the tracefs ring-buffer.

To set up the ring-buffer on the kernel side, a description of the pages
(struct trace_page_desc) is necessary. A callback (get_reader_page) must
also be provided; it is called whenever the kernel is done reading the
previous reader page. The remote is expected to keep the meta-page
updated.
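A minimal sketch of the intended kernel-side usage (the my_* names are
illustrative only, they are not part of this patch):

	static int my_get_reader_page(int cpu)
	{
		/* e.g. a hypercall asking the remote to swap the reader page */
		return my_remote_swap_reader(cpu);
	}

	static struct ring_buffer_remote my_remote = {
		.pdesc		 = my_trace_page_desc, /* pages described by the remote */
		.get_reader_page = my_get_reader_page,
	};

	struct trace_buffer *buffer = ring_buffer_remote(&my_remote);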
Signed-off-by: Vincent Donnefort <vdonnefort@google.com>

diff --git a/include/linux/ring_buffer.h b/include/linux/ring_buffer.h
index 17fbb7855295..2a1330a65edb 100644
--- a/include/linux/ring_buffer.h
+++ b/include/linux/ring_buffer.h
@@ -248,4 +248,59 @@ int ring_buffer_map(struct trace_buffer *buffer, int cpu,
 		    struct vm_area_struct *vma);
 int ring_buffer_unmap(struct trace_buffer *buffer, int cpu);
 int ring_buffer_map_get_reader(struct trace_buffer *buffer, int cpu);
+
+#define meta_pages_lost(__meta) \
+	((__meta)->Reserved1)
+#define meta_pages_touched(__meta) \
+	((__meta)->Reserved2)
+
+struct rb_page_desc {
+	unsigned int	cpu;
+	unsigned int	nr_page_va;	/* excludes the meta page */
+	unsigned long	meta_va;
+	unsigned long	page_va[];
+};
+
+struct trace_page_desc {
+	size_t		struct_len;
+	unsigned int	nr_cpus;
+	char		__data[];	/* list of rb_page_desc */
+};
+
+static inline
+struct rb_page_desc *__next_rb_page_desc(struct rb_page_desc *pdesc)
+{
+	size_t len = struct_size(pdesc, page_va, pdesc->nr_page_va);
+
+	return (struct rb_page_desc *)((void *)pdesc + len);
+}
+
+static inline
+struct rb_page_desc *__first_rb_page_desc(struct trace_page_desc *trace_pdesc)
+{
+	return (struct rb_page_desc *)(&trace_pdesc->__data[0]);
+}
+
+#define for_each_rb_page_desc(__pdesc, __cpu, __trace_pdesc)		\
+	for (__pdesc = __first_rb_page_desc(__trace_pdesc), __cpu = 0;	\
+	     __cpu < (__trace_pdesc)->nr_cpus;				\
+	     __cpu++, __pdesc = __next_rb_page_desc(__pdesc))
+
+struct ring_buffer_remote {
+	struct trace_page_desc	*pdesc;
+	int			(*get_reader_page)(int cpu);
+	int			(*reset)(int cpu);
+};
+
+int ring_buffer_poll_remote(struct trace_buffer *buffer, int cpu);
+
+struct trace_buffer *
+__ring_buffer_alloc_remote(struct ring_buffer_remote *remote,
+			   struct lock_class_key *key);
+
+#define ring_buffer_remote(remote)			\
+({							\
+	static struct lock_class_key __key;		\
+	__ring_buffer_alloc_remote(remote, &__key);	\
+})
 #endif /* _LINUX_RING_BUFFER_H */
diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
index bb6089c2951e..c27516a384a8 100644
--- a/kernel/trace/ring_buffer.c
+++ b/kernel/trace/ring_buffer.c
@@ -519,6 +519,8 @@ struct ring_buffer_per_cpu {
 	struct trace_buffer_meta	*meta_page;
 	struct ring_buffer_meta		*ring_meta;
 
+	struct ring_buffer_remote	*remote;
+
 	/* ring buffer pages to update, > 0 to add, < 0 to remove */
 	long				nr_pages_to_update;
 	struct list_head		new_pages; /* new pages to add */
@@ -541,6 +543,8 @@ struct trace_buffer {
 
 	struct ring_buffer_per_cpu	**buffers;
 
+	struct ring_buffer_remote	*remote;
+
 	struct hlist_node		node;
 	u64				(*clock)(void);
@@ -2155,6 +2159,41 @@ static int __rb_allocate_pages(struct ring_buffer_per_cpu *cpu_buffer,
 	return -ENOMEM;
 }
 
+static struct rb_page_desc *rb_page_desc(struct trace_page_desc *trace_pdesc, int cpu)
+{
+	struct rb_page_desc *pdesc, *end;
+	size_t len;
+	int i;
+
+	if (!trace_pdesc)
+		return NULL;
+
+	if (cpu >= trace_pdesc->nr_cpus)
+		return NULL;
+
+	end = (struct rb_page_desc *)((void *)trace_pdesc + trace_pdesc->struct_len);
+	pdesc = __first_rb_page_desc(trace_pdesc);
+	len = struct_size(pdesc, page_va, pdesc->nr_page_va);
+	pdesc = (struct rb_page_desc *)((void *)pdesc + (len * cpu));
+
+	if (pdesc < end && pdesc->cpu == cpu)
+		return pdesc;
+
+	/* Missing CPUs, need a linear search */
+	for_each_rb_page_desc(pdesc, i, trace_pdesc) {
+		if (pdesc->cpu == cpu)
+			return pdesc;
+	}
+
+	return NULL;
+}
+
+static void *rb_page_desc_page(struct rb_page_desc *pdesc, int page_id)
+{
+	return page_id > pdesc->nr_page_va ?
+	       NULL : (void *)pdesc->page_va[page_id];
+}
+
+
 static int rb_allocate_pages(struct ring_buffer_per_cpu *cpu_buffer,
 			     unsigned long nr_pages)
 {
@@ -2215,6 +2254,31 @@ rb_allocate_cpu_buffer(struct trace_buffer *buffer, long nr_pages, int cpu)
 
 	cpu_buffer->reader_page = bpage;
 
+	if (buffer->remote) {
+		struct rb_page_desc *pdesc = rb_page_desc(buffer->remote->pdesc, cpu);
+
+		if (!pdesc)
+			goto fail_free_reader;
+
+		cpu_buffer->remote = buffer->remote;
+		cpu_buffer->meta_page = (struct trace_buffer_meta *)(void *)pdesc->meta_va;
+		cpu_buffer->subbuf_ids = pdesc->page_va;
+		cpu_buffer->nr_pages = pdesc->nr_page_va - 1;
+		atomic_inc(&cpu_buffer->record_disabled);
+		atomic_inc(&cpu_buffer->resize_disabled);
+
+		bpage->page = rb_page_desc_page(pdesc,
+						cpu_buffer->meta_page->reader.id);
+		if (!bpage->page)
+			goto fail_free_reader;
+		/*
+		 * The meta-page can only describe which of the ring-buffer pages
+		 * is the reader. There is no need to init the rest of the
+		 * ring-buffer.
+		 */
+		return cpu_buffer;
+	}
+
 	if (buffer->range_addr_start) {
 		/*
 		 * Range mapped buffers have the same restrictions as memory
@@ -2292,6 +2356,10 @@ static void rb_free_cpu_buffer(struct ring_buffer_per_cpu *cpu_buffer)
 
 	irq_work_sync(&cpu_buffer->irq_work.work);
 
+	/* remote ring-buffer. We do not own the data pages */
+	if (cpu_buffer->remote)
+		cpu_buffer->reader_page->page = NULL;
+
 	free_buffer_page(cpu_buffer->reader_page);
 
 	if (head) {
@@ -2313,7 +2381,8 @@
 static struct trace_buffer *alloc_buffer(unsigned long size, unsigned flags,
 					 int order, unsigned long start,
 					 unsigned long end,
-					 struct lock_class_key *key)
+					 struct lock_class_key *key,
+					 struct ring_buffer_remote *remote)
 {
 	struct trace_buffer *buffer;
 	long nr_pages;
@@ -2341,6 +2410,11 @@ static struct trace_buffer *alloc_buffer(unsigned long size, unsigned flags,
 	buffer->flags = flags;
 	buffer->clock = trace_clock_local;
 	buffer->reader_lock_key = key;
+	if (remote) {
+		buffer->remote = remote;
+		/* The writer is remote. This ring-buffer is read-only */
+		atomic_inc(&buffer->record_disabled);
+	}
 
 	init_irq_work(&buffer->irq_work.work, rb_wake_up_waiters);
 	init_waitqueue_head(&buffer->irq_work.waiters);
@@ -2447,7 +2521,7 @@ struct trace_buffer *__ring_buffer_alloc(unsigned long size, unsigned flags,
 					 struct lock_class_key *key)
 {
 	/* Default buffer page size - one system page */
-	return alloc_buffer(size, flags, 0, 0, 0,key);
+	return alloc_buffer(size, flags, 0, 0, 0, key, NULL);
 }
 EXPORT_SYMBOL_GPL(__ring_buffer_alloc);
 
@@ -2471,7 +2545,18 @@ struct trace_buffer *__ring_buffer_alloc_range(unsigned long size, unsigned flags,
 					       unsigned long range_size,
 					       struct lock_class_key *key)
 {
-	return alloc_buffer(size, flags, order, start, start + range_size, key);
+	return alloc_buffer(size, flags, order, start, start + range_size, key, NULL);
+}
+
+/**
+ * __ring_buffer_alloc_remote - allocate a new ring_buffer from a remote
+ * @remote: Contains a description of the ring-buffer pages and remote callbacks.
+ * @key: ring buffer reader_lock_key.
+ */
+struct trace_buffer *__ring_buffer_alloc_remote(struct ring_buffer_remote *remote,
+						struct lock_class_key *key)
+{
+	return alloc_buffer(0, 0, 0, 0, 0, key, remote);
 }
 
 /**
@@ -5225,8 +5310,56 @@ rb_update_iter_read_stamp(struct ring_buffer_iter *iter,
 	}
 }
 
+static bool rb_read_remote_meta_page(struct ring_buffer_per_cpu *cpu_buffer)
+{
+	local_set(&cpu_buffer->entries, READ_ONCE(cpu_buffer->meta_page->entries));
+	local_set(&cpu_buffer->overrun, READ_ONCE(cpu_buffer->meta_page->overrun));
+	local_set(&cpu_buffer->pages_touched, READ_ONCE(meta_pages_touched(cpu_buffer->meta_page)));
+	local_set(&cpu_buffer->pages_lost, READ_ONCE(meta_pages_lost(cpu_buffer->meta_page)));
+	/*
+	 * No need to get the "read" field, it can be tracked here as any
+	 * reader will have to go through a ring_buffer_per_cpu.
+	 */
+
+	return rb_num_of_entries(cpu_buffer);
+}
+
 static struct buffer_page *
-rb_get_reader_page(struct ring_buffer_per_cpu *cpu_buffer)
+__rb_get_reader_page_from_remote(struct ring_buffer_per_cpu *cpu_buffer)
+{
+	u32 prev_reader;
+
+	if (!rb_read_remote_meta_page(cpu_buffer))
+		return NULL;
+
+	/* More to read on the reader page */
+	if (cpu_buffer->reader_page->read < rb_page_size(cpu_buffer->reader_page)) {
+		if (!cpu_buffer->reader_page->read)
+			cpu_buffer->read_stamp = cpu_buffer->reader_page->page->time_stamp;
+		return cpu_buffer->reader_page;
+	}
+
+	prev_reader = cpu_buffer->meta_page->reader.id;
+
+	WARN_ON(cpu_buffer->remote->get_reader_page(cpu_buffer->cpu));
+	/* nr_pages doesn't include the reader page */
+	if (WARN_ON(cpu_buffer->meta_page->reader.id > cpu_buffer->nr_pages))
+		return NULL;
+
+	cpu_buffer->reader_page->page =
+		(void *)cpu_buffer->subbuf_ids[cpu_buffer->meta_page->reader.id];
+	cpu_buffer->reader_page->id = cpu_buffer->meta_page->reader.id;
+	cpu_buffer->reader_page->read = 0;
+	cpu_buffer->read_stamp = cpu_buffer->reader_page->page->time_stamp;
+	cpu_buffer->lost_events = cpu_buffer->meta_page->reader.lost_events;
+
+	WARN_ON(prev_reader == cpu_buffer->meta_page->reader.id);
+
+	return rb_page_size(cpu_buffer->reader_page) ? cpu_buffer->reader_page : NULL;
+}
+
+static struct buffer_page *
+__rb_get_reader_page(struct ring_buffer_per_cpu *cpu_buffer)
 {
 	struct buffer_page *reader = NULL;
 	unsigned long bsize = READ_ONCE(cpu_buffer->buffer->subbuf_size);
@@ -5397,6 +5530,13 @@ rb_get_reader_page(struct ring_buffer_per_cpu *cpu_buffer)
 	return reader;
 }
 
+static struct buffer_page *
+rb_get_reader_page(struct ring_buffer_per_cpu *cpu_buffer)
+{
+	return cpu_buffer->remote ? __rb_get_reader_page_from_remote(cpu_buffer) :
+				    __rb_get_reader_page(cpu_buffer);
+}
+
 static void rb_advance_reader(struct ring_buffer_per_cpu *cpu_buffer)
 {
 	struct ring_buffer_event *event;
@@ -5801,7 +5941,7 @@ ring_buffer_read_prepare(struct trace_buffer *buffer, int cpu, gfp_t flags)
 	struct ring_buffer_per_cpu *cpu_buffer;
 	struct ring_buffer_iter *iter;
 
-	if (!cpumask_test_cpu(cpu, buffer->cpumask))
+	if (!cpumask_test_cpu(cpu, buffer->cpumask) || buffer->remote)
 		return NULL;
 
 	iter = kzalloc(sizeof(*iter), flags);
@@ -5971,6 +6111,23 @@ rb_reset_cpu(struct ring_buffer_per_cpu *cpu_buffer)
 {
 	struct buffer_page *page;
 
+	if (cpu_buffer->remote) {
+		if (!cpu_buffer->remote->reset)
+			return;
+
+		cpu_buffer->remote->reset(cpu_buffer->cpu);
+		rb_read_remote_meta_page(cpu_buffer);
+
+		/* Read related values, not covered by the meta-page */
+		local_set(&cpu_buffer->pages_read, 0);
+		cpu_buffer->read = 0;
+		cpu_buffer->read_bytes = 0;
+		cpu_buffer->last_overrun = 0;
+		cpu_buffer->reader_page->read = 0;
+
+		return;
+	}
+
 	rb_head_page_deactivate(cpu_buffer);
 
 	cpu_buffer->head_page
@@ -6218,6 +6375,49 @@ bool ring_buffer_empty_cpu(struct trace_buffer *buffer, int cpu)
 }
 EXPORT_SYMBOL_GPL(ring_buffer_empty_cpu);
 
+int ring_buffer_poll_remote(struct trace_buffer *buffer, int cpu)
+{
+	struct ring_buffer_per_cpu *cpu_buffer;
+	unsigned long flags;
+
+	if (cpu != RING_BUFFER_ALL_CPUS) {
+		if (!cpumask_test_cpu(cpu, buffer->cpumask))
+			return -EINVAL;
+
+		cpu_buffer = buffer->buffers[cpu];
+
+		raw_spin_lock_irqsave(&cpu_buffer->reader_lock, flags);
+		if (rb_read_remote_meta_page(cpu_buffer))
+			rb_wakeups(buffer, cpu_buffer);
+		raw_spin_unlock_irqrestore(&cpu_buffer->reader_lock, flags);
+
+		return 0;
+	}
+
+	/*
+	 * Make sure all the ring buffers are up to date before we start reading
+	 * them.
+	 */
+	for_each_buffer_cpu(buffer, cpu) {
+		cpu_buffer = buffer->buffers[cpu];
+
+		raw_spin_lock_irqsave(&cpu_buffer->reader_lock, flags);
+		rb_read_remote_meta_page(buffer->buffers[cpu]);
+		raw_spin_unlock_irqrestore(&cpu_buffer->reader_lock, flags);
+	}
+
+	for_each_buffer_cpu(buffer, cpu) {
+		cpu_buffer = buffer->buffers[cpu];
+
+		raw_spin_lock_irqsave(&cpu_buffer->reader_lock, flags);
+		if (rb_num_of_entries(cpu_buffer))
+			rb_wakeups(buffer, buffer->buffers[cpu]);
+		raw_spin_unlock_irqrestore(&cpu_buffer->reader_lock, flags);
+	}
+
+	return 0;
+}
+
 #ifdef CONFIG_RING_BUFFER_ALLOW_SWAP
 /**
  * ring_buffer_swap_cpu - swap a CPU buffer between two ring buffers
@@ -6469,6 +6669,7 @@ int ring_buffer_read_page(struct trace_buffer *buffer,
 	unsigned int commit;
 	unsigned int read;
 	u64 save_timestamp;
+	bool force_memcpy;
 	int ret = -1;
 
 	if (!cpumask_test_cpu(cpu, buffer->cpumask))
@@ -6506,6 +6707,8 @@ int ring_buffer_read_page(struct trace_buffer *buffer,
 	/* Check if any events were dropped */
 	missed_events = cpu_buffer->lost_events;
 
+	force_memcpy = cpu_buffer->mapped || cpu_buffer->remote;
+
 	/*
 	 * If this page has been partially read or
 	 * if len is not big enough to read the rest of the page or
@@ -6515,7 +6718,7 @@ int ring_buffer_read_page(struct trace_buffer *buffer,
 	 */
 	if (read || (len < (commit - read)) ||
 	    cpu_buffer->reader_page == cpu_buffer->commit_page ||
-	    cpu_buffer->mapped) {
+	    force_memcpy) {
 		struct buffer_data_page *rpage = cpu_buffer->reader_page->page;
 		unsigned int rpos = read;
 		unsigned int pos = 0;
@@ -7097,7 +7300,7 @@ int ring_buffer_map(struct trace_buffer *buffer, int cpu,
 	unsigned long flags, *subbuf_ids;
 	int err = 0;
 
-	if (!cpumask_test_cpu(cpu, buffer->cpumask))
+	if (!cpumask_test_cpu(cpu, buffer->cpumask) || buffer->remote)
		return -EINVAL;
 
 	cpu_buffer = buffer->buffers[cpu];
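For reference, struct trace_page_desc is a flat blob: the fixed header
followed by one variable-sized rb_page_desc per CPU. A sketch of how a
remote could assemble it (nr_cpus, nr_pages, meta_va[] and page_va[][]
are illustrative inputs, not part of this patch):

	struct trace_page_desc *tpdesc = blob;
	struct rb_page_desc *pdesc = __first_rb_page_desc(tpdesc);
	int cpu, i;

	tpdesc->nr_cpus = nr_cpus;
	for (cpu = 0; cpu < nr_cpus; cpu++) {
		pdesc->cpu = cpu;
		pdesc->nr_page_va = nr_pages;	/* excludes the meta page */
		pdesc->meta_va = meta_va[cpu];
		for (i = 0; i < nr_pages; i++)
			pdesc->page_va[i] = page_va[cpu][i];
		pdesc = __next_rb_page_desc(pdesc);
	}
	tpdesc->struct_len = (void *)pdesc - (void *)tpdesc;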
From patchwork Mon Feb 24 12:13:44 2025
From: Vincent Donnefort <vdonnefort@google.com>
Date: Mon, 24 Feb 2025 12:13:44 +0000
Subject: [PATCH 02/11] ring-buffer: Expose buffer_data_page material
Message-ID: <20250224121353.98697-3-vdonnefort@google.com>
In-Reply-To: <20250224121353.98697-1-vdonnefort@google.com>
References: <20250224121353.98697-1-vdonnefort@google.com>
To: rostedt@goodmis.org, mhiramat@kernel.org, mathieu.desnoyers@efficios.com,
    linux-trace-kernel@vger.kernel.org, maz@kernel.org, oliver.upton@linux.dev,
    joey.gouly@arm.com, suzuki.poulose@arm.com, yuzenghui@huawei.com
Cc: kvmarm@lists.linux.dev, linux-arm-kernel@lists.infradead.org,
    jstultz@google.com, qperret@google.com, will@kernel.org,
    kernel-team@android.com, linux-kernel@vger.kernel.org,
    Vincent Donnefort <vdonnefort@google.com>

In preparation for allowing ring-buffer compliant pages to be written
outside of ring_buffer.c, move struct buffer_data_page and the
timestamp encoding functions into the publicly available
ring_buffer.h.
Signed-off-by: Vincent Donnefort <vdonnefort@google.com>

diff --git a/include/linux/ring_buffer.h b/include/linux/ring_buffer.h
index 2a1330a65edb..75cd0cb46768 100644
--- a/include/linux/ring_buffer.h
+++ b/include/linux/ring_buffer.h
@@ -3,8 +3,10 @@
 #define _LINUX_RING_BUFFER_H
 
 #include
-#include
 #include
+#include
+
+#include
 
 #include
 
@@ -20,6 +22,8 @@ struct ring_buffer_event {
 	u32		array[];
 };
 
+#define RB_EVNT_HDR_SIZE (offsetof(struct ring_buffer_event, array))
+
 /**
  * enum ring_buffer_type - internal ring buffer types
  *
@@ -61,11 +65,50 @@ enum ring_buffer_type {
 	RINGBUF_TYPE_TIME_STAMP,
 };
 
+#define TS_SHIFT	27
+#define TS_MASK		((1ULL << TS_SHIFT) - 1)
+#define TS_DELTA_TEST	(~TS_MASK)
+
+/*
+ * We need to fit the time_stamp delta into 27 bits.
+ */
+static inline bool test_time_stamp(u64 delta)
+{
+	return !!(delta & TS_DELTA_TEST);
+}
+
 unsigned ring_buffer_event_length(struct ring_buffer_event *event);
 void *ring_buffer_event_data(struct ring_buffer_event *event);
 u64 ring_buffer_event_time_stamp(struct trace_buffer *buffer,
 				 struct ring_buffer_event *event);
 
+#define BUF_PAGE_HDR_SIZE offsetof(struct buffer_data_page, data)
+
+/* Max payload is BUF_PAGE_SIZE - header (8bytes) */
+#define BUF_MAX_DATA_SIZE (BUF_PAGE_SIZE - (sizeof(u32) * 2))
+
+#define BUF_PAGE_SIZE (PAGE_SIZE - BUF_PAGE_HDR_SIZE)
+
+#define RB_ALIGNMENT		4U
+#define RB_MAX_SMALL_DATA	(RB_ALIGNMENT * RINGBUF_TYPE_DATA_TYPE_LEN_MAX)
+#define RB_EVNT_MIN_SIZE	8U	/* two 32bit words */
+
+#ifndef CONFIG_HAVE_64BIT_ALIGNED_ACCESS
+# define RB_FORCE_8BYTE_ALIGNMENT	0
+# define RB_ARCH_ALIGNMENT		RB_ALIGNMENT
+#else
+# define RB_FORCE_8BYTE_ALIGNMENT	1
+# define RB_ARCH_ALIGNMENT		8U
+#endif
+
+#define RB_ALIGN_DATA	__aligned(RB_ARCH_ALIGNMENT)
+
+struct buffer_data_page {
+	u64		time_stamp;	/* page time stamp */
+	local_t		commit;		/* write committed index */
+	unsigned char	data[] RB_ALIGN_DATA;	/* data of buffer page */
+};
+
 /*
  * ring_buffer_discard_commit will remove an event that has not
  * been committed yet. If this is used, then ring_buffer_unlock_commit
diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
index c27516a384a8..e70f39e0adb1 100644
--- a/kernel/trace/ring_buffer.c
+++ b/kernel/trace/ring_buffer.c
@@ -152,23 +152,6 @@ int ring_buffer_print_entry_header(struct trace_seq *s)
 /* Used for individual buffers (after the counter) */
 #define RB_BUFFER_OFF		(1 << 20)
 
-#define BUF_PAGE_HDR_SIZE offsetof(struct buffer_data_page, data)
-
-#define RB_EVNT_HDR_SIZE (offsetof(struct ring_buffer_event, array))
-#define RB_ALIGNMENT		4U
-#define RB_MAX_SMALL_DATA	(RB_ALIGNMENT * RINGBUF_TYPE_DATA_TYPE_LEN_MAX)
-#define RB_EVNT_MIN_SIZE	8U	/* two 32bit words */
-
-#ifndef CONFIG_HAVE_64BIT_ALIGNED_ACCESS
-# define RB_FORCE_8BYTE_ALIGNMENT	0
-# define RB_ARCH_ALIGNMENT		RB_ALIGNMENT
-#else
-# define RB_FORCE_8BYTE_ALIGNMENT	1
-# define RB_ARCH_ALIGNMENT		8U
-#endif
-
-#define RB_ALIGN_DATA	__aligned(RB_ARCH_ALIGNMENT)
-
 /* define RINGBUF_TYPE_DATA for 'case RINGBUF_TYPE_DATA:' */
 #define RINGBUF_TYPE_DATA 0 ... RINGBUF_TYPE_DATA_TYPE_LEN_MAX
@@ -311,10 +294,6 @@ EXPORT_SYMBOL_GPL(ring_buffer_event_data);
 #define for_each_online_buffer_cpu(buffer, cpu)		\
 	for_each_cpu_and(cpu, buffer->cpumask, cpu_online_mask)
 
-#define TS_SHIFT	27
-#define TS_MASK		((1ULL << TS_SHIFT) - 1)
-#define TS_DELTA_TEST	(~TS_MASK)
-
 static u64 rb_event_time_stamp(struct ring_buffer_event *event)
 {
 	u64 ts;
@@ -333,12 +312,6 @@ static u64 rb_event_time_stamp(struct ring_buffer_event *event)
 
 #define RB_MISSED_MASK		(3 << 30)
 
-struct buffer_data_page {
-	u64		time_stamp;	/* page time stamp */
-	local_t		commit;		/* write committed index */
-	unsigned char	data[] RB_ALIGN_DATA;	/* data of buffer page */
-};
-
 struct buffer_data_read_page {
 	unsigned		order;	/* order of the page */
 	struct buffer_data_page	*data;	/* actual data, stored in this page */
@@ -397,14 +370,6 @@ static void free_buffer_page(struct buffer_page *bpage)
 	kfree(bpage);
 }
 
-/*
- * We need to fit the time_stamp delta into 27 bits.
- */
-static inline bool test_time_stamp(u64 delta)
-{
-	return !!(delta & TS_DELTA_TEST);
-}
-
 struct rb_irq_work {
 	struct irq_work			work;
 	wait_queue_head_t		waiters;
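As a worked example of what the moved helpers encode (write_stamp below
stands in for the writer's last committed timestamp; this mirrors the
TIME_EXTEND handling an external writer is expected to implement):

	u64 delta = ts - write_stamp;

	if (!test_time_stamp(delta)) {
		/* delta fits in 27 bits (~134ms at 1ns): stored inline */
		event->time_delta = delta;
	} else {
		/* too large: emit a TIME_EXTEND, split across two fields */
		event->type_len = RINGBUF_TYPE_TIME_EXTEND;
		event->time_delta = delta & TS_MASK;
		event->array[0] = delta >> TS_SHIFT;
	}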
From patchwork Mon Feb 24 12:13:45 2025
From: Vincent Donnefort <vdonnefort@google.com>
Date: Mon, 24 Feb 2025 12:13:45 +0000
Subject: [PATCH 03/11] KVM: arm64: Support unaligned fixmap in the nVHE hyp
Message-ID: <20250224121353.98697-4-vdonnefort@google.com>
In-Reply-To: <20250224121353.98697-1-vdonnefort@google.com>
References: <20250224121353.98697-1-vdonnefort@google.com>
To: rostedt@goodmis.org, mhiramat@kernel.org, mathieu.desnoyers@efficios.com,
    linux-trace-kernel@vger.kernel.org, maz@kernel.org, oliver.upton@linux.dev,
    joey.gouly@arm.com, suzuki.poulose@arm.com, yuzenghui@huawei.com
Cc: kvmarm@lists.linux.dev, linux-arm-kernel@lists.infradead.org,
    jstultz@google.com, qperret@google.com, will@kernel.org,
    kernel-team@android.com, linux-kernel@vger.kernel.org,
    Vincent Donnefort <vdonnefort@google.com>

Return the fixmap VA with the page offset, instead of the page base
address.
Signed-off-by: Vincent Donnefort <vdonnefort@google.com>

diff --git a/arch/arm64/kvm/hyp/nvhe/mm.c b/arch/arm64/kvm/hyp/nvhe/mm.c
index f41c7440b34b..720cc3b36596 100644
--- a/arch/arm64/kvm/hyp/nvhe/mm.c
+++ b/arch/arm64/kvm/hyp/nvhe/mm.c
@@ -240,7 +240,7 @@ void *hyp_fixmap_map(phys_addr_t phys)
 	WRITE_ONCE(*ptep, pte);
 	dsb(ishst);
-	return (void *)slot->addr;
+	return (void *)slot->addr + offset_in_page(phys);
 }
 
 static void fixmap_clear_slot(struct hyp_fixmap_slot *slot)
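This matters for callers that map a physical address that is not
page-aligned; a sketch of the resulting behaviour (addresses are
illustrative):

	/* phys = 0x40001234, PAGE_SIZE = 4K */
	void *va = hyp_fixmap_map(phys);
	/* before this patch: va = slot base, i.e. offset 0x000  */
	/* after this patch:  va = slot base + 0x234, i.e. phys  */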
From patchwork Mon Feb 24 12:13:46 2025
From: Vincent Donnefort <vdonnefort@google.com>
Date: Mon, 24 Feb 2025 12:13:46 +0000
Subject: [PATCH 04/11] KVM: arm64: Add clock support in the nVHE hyp
Message-ID: <20250224121353.98697-5-vdonnefort@google.com>
In-Reply-To: <20250224121353.98697-1-vdonnefort@google.com>
References: <20250224121353.98697-1-vdonnefort@google.com>
To: rostedt@goodmis.org, mhiramat@kernel.org, mathieu.desnoyers@efficios.com,
    linux-trace-kernel@vger.kernel.org, maz@kernel.org, oliver.upton@linux.dev,
    joey.gouly@arm.com, suzuki.poulose@arm.com, yuzenghui@huawei.com
Cc: kvmarm@lists.linux.dev, linux-arm-kernel@lists.infradead.org,
    jstultz@google.com, qperret@google.com, will@kernel.org,
    kernel-team@android.com, linux-kernel@vger.kernel.org,
    Vincent Donnefort <vdonnefort@google.com>

By default, the arm64 host kernel uses the arch timer as the source for
sched_clock. Conveniently, EL2 has access to that same counter, which
allows generating clock values that are synchronized with the kernel.
The clock nonetheless needs to be set up with the same slope values as
the kernel's. Introduce at the same time trace_clock(), which is
expected to be configured later by the hypervisor tracing support.
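The conversion is the usual clocksource one: ns = ((cyc - epoch_cyc) *
mult) >> shift, added to epoch_ns. A sketch with plausible parameters
(illustrative only; the host derives the real values from its own
sched_clock setup):

	/* 24 MHz arch timer, shift = 24:
	 * mult = (1000000000ULL << 24) / 24000000 ~= 0x29AAAAAB
	 */
	u64 ns = ((cyc - epoch_cyc) * 0x29AAAAABULL) >> 24;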
Signed-off-by: Vincent Donnefort <vdonnefort@google.com>

diff --git a/arch/arm64/include/asm/kvm_hyp.h b/arch/arm64/include/asm/kvm_hyp.h
index c838309e4ec4..355bae0056f0 100644
--- a/arch/arm64/include/asm/kvm_hyp.h
+++ b/arch/arm64/include/asm/kvm_hyp.h
@@ -144,5 +144,4 @@ extern u64 kvm_nvhe_sym(id_aa64smfr0_el1_sys_val);
 extern unsigned long kvm_nvhe_sym(__icache_flags);
 extern unsigned int kvm_nvhe_sym(kvm_arm_vmid_bits);
 extern unsigned int kvm_nvhe_sym(kvm_host_sve_max_vl);
-
 #endif /* __ARM64_KVM_HYP_H__ */
diff --git a/arch/arm64/kvm/hyp/include/nvhe/clock.h b/arch/arm64/kvm/hyp/include/nvhe/clock.h
new file mode 100644
index 000000000000..2bd05b3b89f9
--- /dev/null
+++ b/arch/arm64/kvm/hyp/include/nvhe/clock.h
@@ -0,0 +1,16 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef __ARM64_KVM_HYP_NVHE_CLOCK_H
+#define __ARM64_KVM_HYP_NVHE_CLOCK_H
+#include
+
+#include
+
+#ifdef CONFIG_TRACING
+void trace_clock_update(u32 mult, u32 shift, u64 epoch_ns, u64 epoch_cyc);
+u64 trace_clock(void);
+#else
+static inline void
+trace_clock_update(u32 mult, u32 shift, u64 epoch_ns, u64 epoch_cyc) { }
+static inline u64 trace_clock(void) { return 0; }
+#endif
+#endif
diff --git a/arch/arm64/kvm/hyp/nvhe/Makefile b/arch/arm64/kvm/hyp/nvhe/Makefile
index b43426a493df..323e992089bd 100644
--- a/arch/arm64/kvm/hyp/nvhe/Makefile
+++ b/arch/arm64/kvm/hyp/nvhe/Makefile
@@ -28,6 +28,7 @@ hyp-obj-y := timer-sr.o sysreg-sr.o debug-sr.o switch.o tlb.o hyp-init.o host.o
 hyp-obj-y += ../vgic-v3-sr.o ../aarch32.o ../vgic-v2-cpuif-proxy.o ../entry.o \
 	 ../fpsimd.o ../hyp-entry.o ../exception.o ../pgtable.o
 hyp-obj-$(CONFIG_LIST_HARDENED) += list_debug.o
+hyp-obj-$(CONFIG_TRACING) += clock.o
 hyp-obj-y += $(lib-objs)
 
 ##
diff --git a/arch/arm64/kvm/hyp/nvhe/clock.c b/arch/arm64/kvm/hyp/nvhe/clock.c
new file mode 100644
index 000000000000..879c6b09d9ca
--- /dev/null
+++ b/arch/arm64/kvm/hyp/nvhe/clock.c
@@ -0,0 +1,65 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2025 Google LLC
+ * Author: Vincent Donnefort <vdonnefort@google.com>
+ */
+
+#include
+
+#include
+#include
+
+static struct clock_data {
+	struct {
+		u32 mult;
+		u32 shift;
+		u64 epoch_ns;
+		u64 epoch_cyc;
+		u64 cyc_overflow64;
+	} data[2];
+	u64 cur;
+} trace_clock_data;
+
+static u64 __clock_mult_uint128(u64 cyc, u32 mult, u32 shift)
+{
+	__uint128_t ns = (__uint128_t)cyc * mult;
+
+	ns >>= shift;
+
+	return (u64)ns;
+}
+
+/* Does not guarantee no reader on the modified bank. */
+void trace_clock_update(u32 mult, u32 shift, u64 epoch_ns, u64 epoch_cyc)
+{
+	struct clock_data *clock = &trace_clock_data;
+	u64 bank = clock->cur ^ 1;
+
+	clock->data[bank].mult = mult;
+	clock->data[bank].shift = shift;
+	clock->data[bank].epoch_ns = epoch_ns;
+	clock->data[bank].epoch_cyc = epoch_cyc;
+	clock->data[bank].cyc_overflow64 = ULONG_MAX / mult;
+
+	smp_store_release(&clock->cur, bank);
+}
+
+/* Using host provided data. Do not use for anything else than debugging. */
+u64 trace_clock(void)
+{
+	struct clock_data *clock = &trace_clock_data;
+	u64 bank = smp_load_acquire(&clock->cur);
+	u64 cyc, ns;
+
+	cyc = __arch_counter_get_cntpct() - clock->data[bank].epoch_cyc;
+
+	if (likely(cyc < clock->data[bank].cyc_overflow64)) {
+		ns = cyc * clock->data[bank].mult;
+		ns >>= clock->data[bank].shift;
+	} else {
+		ns = __clock_mult_uint128(cyc, clock->data[bank].mult,
+					  clock->data[bank].shift);
+	}
+
+	return (u64)ns + clock->data[bank].epoch_ns;
+}
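Note the bank flip above: trace_clock_update() fills the inactive bank
and only then publishes it with smp_store_release(&clock->cur, bank),
paired with the smp_load_acquire() in trace_clock(), so a reader always
sees a fully written bank (a reader still on the old bank is, per the
comment, not excluded). A sketch of the host-side feeding, with a
hypothetical helper name; the real plumbing comes later in the series:

	u32 mult, shift;
	u64 epoch_ns, epoch_cyc;

	/* snapshot the host sched_clock parameters (illustrative) */
	my_snapshot_sched_clock(&mult, &shift, &epoch_ns, &epoch_cyc);
	trace_clock_update(mult, shift, epoch_ns, epoch_cyc);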
From patchwork Mon Feb 24 12:13:47 2025
From: Vincent Donnefort <vdonnefort@google.com>
Date: Mon, 24 Feb 2025 12:13:47 +0000
Subject: [PATCH 05/11] KVM: arm64: Add tracing support for the pKVM hyp
Message-ID: <20250224121353.98697-6-vdonnefort@google.com>
In-Reply-To: <20250224121353.98697-1-vdonnefort@google.com>
References: <20250224121353.98697-1-vdonnefort@google.com>
To: rostedt@goodmis.org, mhiramat@kernel.org, mathieu.desnoyers@efficios.com,
    linux-trace-kernel@vger.kernel.org, maz@kernel.org, oliver.upton@linux.dev,
    joey.gouly@arm.com, suzuki.poulose@arm.com, yuzenghui@huawei.com
Cc: kvmarm@lists.linux.dev, linux-arm-kernel@lists.infradead.org,
    jstultz@google.com, qperret@google.com, will@kernel.org,
    kernel-team@android.com, linux-kernel@vger.kernel.org,
    Vincent Donnefort <vdonnefort@google.com>

When running in protected mode, the host has very little knowledge of
what is happening in the hypervisor. Of course this is an essential
feature for security, but as that piece of code grows with more
responsibilities, we now need a way to debug and profile it. Tracefs,
with its reliability, versatility and user-space support, is the
perfect tool.

There is no way the hypervisor could log events directly into the host
tracefs ring-buffers. So instead let's use our own, where the
hypervisor is the writer and the host the reader.
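With the API below, a hypervisor-side event write boils down to
reserve/fill/commit (struct my_event is an illustrative payload type,
not part of this patch):

	struct my_event *evt = tracing_reserve_entry(sizeof(*evt));

	if (evt) {			/* NULL when tracing is disabled */
		evt->cpu = cpu;
		tracing_commit_entry();
	}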
Signed-off-by: Vincent Donnefort <vdonnefort@google.com>

diff --git a/arch/arm64/include/asm/kvm_asm.h b/arch/arm64/include/asm/kvm_asm.h
index bec227f9500a..b5893e0afe8e 100644
--- a/arch/arm64/include/asm/kvm_asm.h
+++ b/arch/arm64/include/asm/kvm_asm.h
@@ -87,6 +87,10 @@ enum __kvm_host_smccc_func {
 	__KVM_HOST_SMCCC_FUNC___pkvm_vcpu_load,
 	__KVM_HOST_SMCCC_FUNC___pkvm_vcpu_put,
 	__KVM_HOST_SMCCC_FUNC___pkvm_tlb_flush_vmid,
+	__KVM_HOST_SMCCC_FUNC___pkvm_load_tracing,
+	__KVM_HOST_SMCCC_FUNC___pkvm_teardown_tracing,
+	__KVM_HOST_SMCCC_FUNC___pkvm_enable_tracing,
+	__KVM_HOST_SMCCC_FUNC___pkvm_swap_reader_tracing,
 };
 
 #define DECLARE_KVM_VHE_SYM(sym)	extern char sym[]
diff --git a/arch/arm64/include/asm/kvm_hyptrace.h b/arch/arm64/include/asm/kvm_hyptrace.h
new file mode 100644
index 000000000000..7da6a248c7fa
--- /dev/null
+++ b/arch/arm64/include/asm/kvm_hyptrace.h
@@ -0,0 +1,21 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+#ifndef __ARM64_KVM_HYPTRACE_H_
+#define __ARM64_KVM_HYPTRACE_H_
+#include
+
+#include
+
+/*
+ * Host donations to the hypervisor to store the struct hyp_buffer_page.
+ */
+struct hyp_buffer_pages_backing {
+	unsigned long	start;
+	size_t		size;
+};
+
+struct hyp_trace_desc {
+	struct hyp_buffer_pages_backing	backing;
+	struct trace_page_desc		page_desc;
+
+};
+#endif
diff --git a/arch/arm64/kvm/hyp/include/nvhe/trace.h b/arch/arm64/kvm/hyp/include/nvhe/trace.h
new file mode 100644
index 000000000000..bf74a6ee322d
--- /dev/null
+++ b/arch/arm64/kvm/hyp/include/nvhe/trace.h
@@ -0,0 +1,32 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+#ifndef __ARM64_KVM_HYP_NVHE_TRACE_H
+#define __ARM64_KVM_HYP_NVHE_TRACE_H
+#include
+
+/* Internal struct exported for hyp-constants.c */
+struct hyp_buffer_page {
+	struct list_head	list;
+	struct buffer_data_page	*page;
+	u64			entries;
+	u32			write;
+	u32			id;
+};
+
+#ifdef CONFIG_TRACING
+void *tracing_reserve_entry(unsigned long length);
+void tracing_commit_entry(void);
+
+int __pkvm_load_tracing(unsigned long desc_va, size_t desc_size);
+void __pkvm_teardown_tracing(void);
+int __pkvm_enable_tracing(bool enable);
+int __pkvm_swap_reader_tracing(unsigned int cpu);
+#else
+static inline void *tracing_reserve_entry(unsigned long length) { return NULL; }
+static inline void tracing_commit_entry(void) { }
+
+static inline int __pkvm_load_tracing(unsigned long desc_va, size_t desc_size) { return -ENODEV; }
+static inline void __pkvm_teardown_tracing(void) { }
+static inline int __pkvm_enable_tracing(bool enable) { return -ENODEV; }
+static inline int __pkvm_swap_reader_tracing(unsigned int cpu) { return -ENODEV; }
+#endif
+#endif
diff --git a/arch/arm64/kvm/hyp/nvhe/Makefile b/arch/arm64/kvm/hyp/nvhe/Makefile
index 323e992089bd..40f243c44cf5 100644
--- a/arch/arm64/kvm/hyp/nvhe/Makefile
+++ b/arch/arm64/kvm/hyp/nvhe/Makefile
@@ -28,7 +28,7 @@ hyp-obj-y := timer-sr.o sysreg-sr.o debug-sr.o switch.o tlb.o hyp-init.o host.o
 hyp-obj-y += ../vgic-v3-sr.o ../aarch32.o ../vgic-v2-cpuif-proxy.o ../entry.o \
 	 ../fpsimd.o ../hyp-entry.o ../exception.o ../pgtable.o
 hyp-obj-$(CONFIG_LIST_HARDENED) += list_debug.o
-hyp-obj-$(CONFIG_TRACING) += clock.o
+hyp-obj-$(CONFIG_TRACING) += clock.o trace.o
 hyp-obj-y += $(lib-objs)
 
 ##
diff --git a/arch/arm64/kvm/hyp/nvhe/hyp-main.c b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
index 2c37680d954c..ced0a161d56e 100644
--- a/arch/arm64/kvm/hyp/nvhe/hyp-main.c
+++ b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
@@ -18,6 +18,7 @@
 #include
 #include
 #include
+#include
 
 #include
 
 DEFINE_PER_CPU(struct kvm_nvhe_init_params, kvm_init_params);
@@ -570,6 +571,35 @@ static void handle___pkvm_teardown_vm(struct kvm_cpu_context *host_ctxt)
 	cpu_reg(host_ctxt, 1) = __pkvm_teardown_vm(handle);
 }
 
+static void handle___pkvm_load_tracing(struct kvm_cpu_context *host_ctxt)
+{
+	DECLARE_REG(unsigned long, desc_hva, host_ctxt, 1);
+	DECLARE_REG(size_t, desc_size, host_ctxt, 2);
+
+	cpu_reg(host_ctxt, 1) = __pkvm_load_tracing(desc_hva, desc_size);
+}
+
+static void handle___pkvm_teardown_tracing(struct kvm_cpu_context *host_ctxt)
+{
+	__pkvm_teardown_tracing();
+
+	cpu_reg(host_ctxt, 1) = 0;
+}
+
+static void handle___pkvm_enable_tracing(struct kvm_cpu_context *host_ctxt)
+{
+	DECLARE_REG(bool, enable, host_ctxt, 1);
+
+	cpu_reg(host_ctxt, 1) = __pkvm_enable_tracing(enable);
+}
+
+static void handle___pkvm_swap_reader_tracing(struct kvm_cpu_context *host_ctxt)
+{
+	DECLARE_REG(unsigned int, cpu, host_ctxt, 1);
+
+	cpu_reg(host_ctxt, 1) = __pkvm_swap_reader_tracing(cpu);
+}
+
 typedef void (*hcall_t)(struct kvm_cpu_context *);
 
 #define HANDLE_FUNC(x)	[__KVM_HOST_SMCCC_FUNC_##x] = (hcall_t)handle_##x
@@ -609,6 +639,10 @@ static const hcall_t host_hcall[] = {
 	HANDLE_FUNC(__pkvm_vcpu_load),
 	HANDLE_FUNC(__pkvm_vcpu_put),
 	HANDLE_FUNC(__pkvm_tlb_flush_vmid),
+	HANDLE_FUNC(__pkvm_load_tracing),
+	HANDLE_FUNC(__pkvm_teardown_tracing),
+	HANDLE_FUNC(__pkvm_enable_tracing),
+	HANDLE_FUNC(__pkvm_swap_reader_tracing),
 };
 
 static void handle_host_hcall(struct kvm_cpu_context *host_ctxt)
diff --git a/arch/arm64/kvm/hyp/nvhe/trace.c b/arch/arm64/kvm/hyp/nvhe/trace.c
new file mode 100644
index 000000000000..4611cef64566
--- /dev/null
+++ b/arch/arm64/kvm/hyp/nvhe/trace.c
@@ -0,0 +1,558 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (C) 2025 Google LLC
+ * Author: Vincent Donnefort <vdonnefort@google.com>
+ */
+
+#include
+#include
+#include
+#include
+
+#include
+#include
+#include
+
+struct hyp_rb_per_cpu {
+	struct trace_buffer_meta	*meta;
+	struct hyp_buffer_page		*tail_page;
+	struct hyp_buffer_page		*reader_page;
+	struct hyp_buffer_page		*head_page;
+	struct hyp_buffer_page		*bpages;
+	u32				nr_pages;
+	u32				status;
+	u64				last_overrun;
+	u64				write_stamp;
+};
+
+#define HYP_RB_UNAVAILABLE	0
+#define HYP_RB_READY		1
+#define HYP_RB_WRITING		2
+
+static struct hyp_buffer_pages_backing hyp_buffer_pages_backing;
+static DEFINE_PER_CPU(struct hyp_rb_per_cpu, trace_rb);
+static DEFINE_HYP_SPINLOCK(trace_rb_lock);
+
+#define HYP_BPAGE_LINK_HEAD	1UL
+#define HYP_BPAGE_LINK_MASK	~HYP_BPAGE_LINK_HEAD
+
+static bool hyp_bpage_try_shunt_link(struct hyp_buffer_page *bpage, struct hyp_buffer_page *dst,
+				     unsigned long old_flags, unsigned long flags)
+{
+	unsigned long *ptr = (unsigned long *)(&bpage->list.next);
+	unsigned long old = (*ptr & HYP_BPAGE_LINK_MASK) | old_flags;
+	unsigned long new = (unsigned long)(&dst->list) | flags;
+
+	return cmpxchg(ptr, old, new) == old;
+}
+
+static void hyp_bpage_set_link_flag(struct hyp_buffer_page *bpage, unsigned long flag)
+{
+	bpage->list.next = (struct list_head *)
+		(((unsigned long)bpage->list.next & HYP_BPAGE_LINK_MASK) | flag);
+}
+
+static struct hyp_buffer_page *hyp_bpage_from_link(struct list_head *list)
+{
+	unsigned long ptr = (unsigned long)list & HYP_BPAGE_LINK_MASK;
+
+	return container_of((struct list_head *)ptr, struct hyp_buffer_page, list);
+}
+
+static struct hyp_buffer_page *hyp_bpage_next_page(struct hyp_buffer_page *bpage)
+{
+	return hyp_bpage_from_link(bpage->list.next);
+}
+
+static bool hyp_bpage_is_head(struct hyp_buffer_page *bpage)
+{
+	return (unsigned long)bpage->list.prev->next & HYP_BPAGE_LINK_HEAD;
+}
+
+static void hyp_bpage_reset(struct hyp_buffer_page *bpage)
+{
+	bpage->write = 0;
+	bpage->entries = 0;
+
+	local_set(&bpage->page->commit, 0);
+}
+
+static int hyp_bpage_init(struct hyp_buffer_page *bpage, unsigned long hva)
+{
+	void *hyp_va = (void *)kern_hyp_va(hva);
+	int ret;
+
+	ret = hyp_pin_shared_mem(hyp_va, hyp_va + PAGE_SIZE);
+	if (ret)
+		return ret;
+
+	INIT_LIST_HEAD(&bpage->list);
+	bpage->page = (struct buffer_data_page *)hyp_va;
+
+	hyp_bpage_reset(bpage);
+
+	return 0;
+}
+
+#define hyp_rb_meta_inc(__meta, __inc)		\
+	WRITE_ONCE((__meta), (__meta + __inc))
+
+static bool hyp_rb_loaded(struct hyp_rb_per_cpu *cpu_buffer)
+{
+	return !!cpu_buffer->bpages;
+}
+
+static int hyp_rb_swap_reader(struct hyp_rb_per_cpu *cpu_buffer)
+{
+	struct hyp_buffer_page *last, *head, *reader;
+	unsigned long overrun;
+
+	if (!hyp_rb_loaded(cpu_buffer))
+		return -ENODEV;
+
+	head = cpu_buffer->head_page;
+	reader = cpu_buffer->reader_page;
+
+	do {
+		/* Run after the writer to find the head */
+		while (!hyp_bpage_is_head(head))
+			cpu_buffer->head_page = head = hyp_bpage_next_page(head);
+
+		/* Connect the reader page around the header page */
+		reader->list.next = head->list.next;
+		reader->list.prev = head->list.prev;
+
+		/* The last page before the head */
+		last = hyp_bpage_from_link(reader->list.next);
+
+		/* The reader page points to the new header page */
+		hyp_bpage_set_link_flag(reader, HYP_BPAGE_LINK_HEAD);
+
+		overrun = smp_load_acquire(&cpu_buffer->meta->overrun);
+	} while (!hyp_bpage_try_shunt_link(last, reader, HYP_BPAGE_LINK_HEAD, 0));
+
+	cpu_buffer->head_page = hyp_bpage_from_link(reader->list.next);
+	cpu_buffer->head_page->list.prev = &reader->list;
+	cpu_buffer->reader_page = head;
+	cpu_buffer->meta->reader.lost_events = overrun - cpu_buffer->last_overrun;
+	cpu_buffer->meta->reader.id = cpu_buffer->reader_page->id;
+	cpu_buffer->last_overrun = overrun;
+
+	return 0;
+}
+
+static struct hyp_buffer_page *hyp_rb_move_tail(struct hyp_rb_per_cpu *cpu_buffer)
+{
+	struct hyp_buffer_page *tail, *new_tail;
+
+	tail = cpu_buffer->tail_page;
+	new_tail = hyp_bpage_next_page(tail);
+
+	if (hyp_bpage_try_shunt_link(tail, new_tail, HYP_BPAGE_LINK_HEAD, 0)) {
+		/*
+		 * Oh no! we've caught the head. There is none anymore and
+		 * swap_reader will spin until we set the new one. Overrun must
+		 * be written first, to make sure we report the correct number
+		 * of lost events.
+		 */
+		hyp_rb_meta_inc(cpu_buffer->meta->overrun, new_tail->entries);
+		hyp_rb_meta_inc(meta_pages_lost(cpu_buffer->meta), 1);
+
+		smp_store_release(&new_tail->list.next,
+				  (unsigned long)new_tail->list.next | HYP_BPAGE_LINK_HEAD);
+	}
+
+	hyp_bpage_reset(new_tail);
+	cpu_buffer->tail_page = new_tail;
+
+	hyp_rb_meta_inc(meta_pages_touched(cpu_buffer->meta), 1);
+
+	return new_tail;
+}
+
+static unsigned long rb_event_size(unsigned long length)
+{
+	struct ring_buffer_event *event;
+
+	return length + RB_EVNT_HDR_SIZE + sizeof(event->array[0]);
+}
+
+static struct ring_buffer_event *
+rb_add_ts_extend(struct ring_buffer_event *event, u64 delta)
+{
+	event->type_len = RINGBUF_TYPE_TIME_EXTEND;
+	event->time_delta = delta & TS_MASK;
+	event->array[0] = delta >> TS_SHIFT;
+
+	return (struct ring_buffer_event *)((unsigned long)event + 8);
+}
+
+static struct ring_buffer_event *
+hyp_rb_reserve_next(struct hyp_rb_per_cpu *cpu_buffer, unsigned long length)
+{
+	unsigned long ts_ext_size = 0, event_size = rb_event_size(length);
+	struct hyp_buffer_page *tail = cpu_buffer->tail_page;
+	struct ring_buffer_event *event;
+	u32 write, prev_write;
+	u64 ts, time_delta;
+
+	ts = trace_clock();
+
+	time_delta = ts - cpu_buffer->write_stamp;
+
+	if (test_time_stamp(time_delta))
+		ts_ext_size = 8;
+
+	prev_write = tail->write;
+	write = prev_write + event_size + ts_ext_size;
+
+	if (unlikely(write > BUF_PAGE_SIZE))
+		tail = hyp_rb_move_tail(cpu_buffer);
+
+	if (!tail->entries) {
+		tail->page->time_stamp = ts;
+		time_delta = 0;
+		ts_ext_size = 0;
+		write = event_size;
+		prev_write = 0;
+	}
+
+	tail->write = write;
+	tail->entries++;
+
+	cpu_buffer->write_stamp = ts;
+
+	event = (struct ring_buffer_event *)(tail->page->data + prev_write);
+	if (ts_ext_size) {
+		event = rb_add_ts_extend(event, time_delta);
+		time_delta = 0;
+	}
+
+	event->type_len = 0;
+	event->time_delta = time_delta;
+	event->array[0] = event_size - RB_EVNT_HDR_SIZE;
+
+	return event;
+}
+
+void *tracing_reserve_entry(unsigned long length)
+{
+	struct hyp_rb_per_cpu *cpu_buffer = this_cpu_ptr(&trace_rb);
+	struct ring_buffer_event *rb_event;
+
+	if (cmpxchg(&cpu_buffer->status, HYP_RB_READY, HYP_RB_WRITING) != HYP_RB_READY)
+		return NULL;
+
+	rb_event = hyp_rb_reserve_next(cpu_buffer, length);
+
+	return &rb_event->array[1];
+}
+
+void tracing_commit_entry(void)
+{
+	struct hyp_rb_per_cpu *cpu_buffer = this_cpu_ptr(&trace_rb);
+
+	local_set(&cpu_buffer->tail_page->page->commit,
+		  cpu_buffer->tail_page->write);
+	hyp_rb_meta_inc(cpu_buffer->meta->entries, 1);
+
+	/*
+	 * Paired with hyp_rb_disable_writing() to ensure data is
+	 * written to the ring-buffer before teardown.
+	 */
+	smp_store_release(&cpu_buffer->status, HYP_RB_READY);
+}
+
+static void hyp_rb_disable_writing(struct hyp_rb_per_cpu *cpu_buffer)
+{
+	u32 prev_status;
+
+	/* Wait for the buffer to be released */
+	do {
+		prev_status = cmpxchg_acquire(&cpu_buffer->status,
+					      HYP_RB_READY,
+					      HYP_RB_UNAVAILABLE);
+	} while (prev_status == HYP_RB_WRITING);
+}
+
+static int hyp_rb_enable_writing(struct hyp_rb_per_cpu *cpu_buffer)
+{
+	if (!hyp_rb_loaded(cpu_buffer))
+		return -ENODEV;
+
+	cmpxchg(&cpu_buffer->status, HYP_RB_UNAVAILABLE, HYP_RB_READY);
+
+	return 0;
+}
+
+static void hyp_rb_teardown(struct hyp_rb_per_cpu *cpu_buffer)
+{
+	int i;
+
+	if (!hyp_rb_loaded(cpu_buffer))
+		return;
+
+	hyp_rb_disable_writing(cpu_buffer);
+
+	hyp_unpin_shared_mem((void *)cpu_buffer->meta,
+			     (void *)(cpu_buffer->meta) + PAGE_SIZE);
+
+	for (i = 0; i < cpu_buffer->nr_pages; i++) {
+		struct hyp_buffer_page *bpage = &cpu_buffer->bpages[i];
+
+		if (!bpage->page)
+			continue;
+
+		hyp_unpin_shared_mem((void *)bpage->page,
+				     (void *)bpage->page + PAGE_SIZE);
+	}
+
+	cpu_buffer->bpages = 0;
+}
+
+static bool hyp_rb_fits_backing(u32 nr_pages, struct hyp_buffer_page *start)
+{
+	unsigned long max = hyp_buffer_pages_backing.start +
+			    hyp_buffer_pages_backing.size;
+	struct hyp_buffer_page *end = start + nr_pages;
+
+	return (unsigned long)end <= max;
+}
+
+static int hyp_rb_init(struct rb_page_desc *pdesc, struct hyp_buffer_page *start,
+		       struct hyp_rb_per_cpu *cpu_buffer)
+{
+	struct hyp_buffer_page *bpage = start;
+	int i, ret;
+
+	/* At least 1 reader page and one head */
+	if (pdesc->nr_page_va < 2)
+		return -EINVAL;
+
+	/* nr_page_va + 1 must fit nr_pages */
+	if (pdesc->nr_page_va >= U32_MAX)
+		return -EINVAL;
+
+	if (!hyp_rb_fits_backing(pdesc->nr_page_va, start))
+		return -EINVAL;
+
+	if (hyp_rb_loaded(cpu_buffer))
+		return -EBUSY;
+
+	cpu_buffer->bpages = start;
+
+	cpu_buffer->meta = (struct trace_buffer_meta *)kern_hyp_va(pdesc->meta_va);
+	ret = hyp_pin_shared_mem((void *)cpu_buffer->meta,
+				 ((void *)cpu_buffer->meta) + PAGE_SIZE);
+	if (ret)
+		return ret;
+
+	memset(cpu_buffer->meta, 0, sizeof(*cpu_buffer->meta));
+	cpu_buffer->meta->meta_page_size = PAGE_SIZE;
+	cpu_buffer->meta->nr_subbufs = cpu_buffer->nr_pages;
+
+	/* The reader page is not part of the ring initially */
+	ret = hyp_bpage_init(bpage, pdesc->page_va[0]);
+	if (ret)
+		goto err;
+
+	cpu_buffer->nr_pages = 1;
+
+	cpu_buffer->reader_page = bpage;
+	cpu_buffer->tail_page = bpage + 1;
+	cpu_buffer->head_page = bpage + 1;
+
+	for (i = 1; i < pdesc->nr_page_va; i++) {
+		ret = hyp_bpage_init(++bpage, pdesc->page_va[i]);
+		if (ret)
+			goto err;
+
+		bpage->list.next = &(bpage + 1)->list;
+		bpage->list.prev = &(bpage - 1)->list;
+		bpage->id = i;
+
+		cpu_buffer->nr_pages = i + 1;
+	}
+
+	/* Close the ring */
+	bpage->list.next = &cpu_buffer->tail_page->list;
+	cpu_buffer->tail_page->list.prev = &bpage->list;
+
+	/* The last init'ed page points to the head page */
+	hyp_bpage_set_link_flag(bpage, HYP_BPAGE_LINK_HEAD);
+
+	cpu_buffer->last_overrun = 0;
+
+	return 0;
+
+err:
+	hyp_rb_teardown(cpu_buffer);
+
+	return ret;
+}
+
+static int hyp_setup_bpage_backing(struct hyp_trace_desc *desc)
+{
+	unsigned long start = kern_hyp_va(desc->backing.start);
+	size_t size = desc->backing.size;
+	int ret;
+
+	if (hyp_buffer_pages_backing.size)
+		return -EBUSY;
+
+	if (!PAGE_ALIGNED(start) || !PAGE_ALIGNED(size))
+		return -EINVAL;
+
+	ret = __pkvm_host_donate_hyp(hyp_virt_to_pfn((void *)start), size >> PAGE_SHIFT);
+	if (ret)
+		return ret;
+
+	memset((void *)start, 0,
size); + + hyp_buffer_pages_backing.start = start; + hyp_buffer_pages_backing.size = size; + + return 0; +} + +static void hyp_teardown_bpage_backing(void) +{ + unsigned long start = hyp_buffer_pages_backing.start; + size_t size = hyp_buffer_pages_backing.size; + + if (!size) + return; + + memset((void *)start, 0, size); + + WARN_ON(__pkvm_hyp_donate_host(hyp_virt_to_pfn(start), size >> PAGE_SHIFT)); + + hyp_buffer_pages_backing.start = 0; + hyp_buffer_pages_backing.size = 0; +} + +int __pkvm_swap_reader_tracing(unsigned int cpu) +{ + int ret = 0; + + if (cpu >= hyp_nr_cpus) + return -EINVAL; + + hyp_spin_lock(&trace_rb_lock); + ret = hyp_rb_swap_reader(per_cpu_ptr(&trace_rb, cpu)); + hyp_spin_unlock(&trace_rb_lock); + + return ret; +} + +static void __pkvm_teardown_tracing_locked(void) +{ + int cpu; + + hyp_assert_lock_held(&trace_rb_lock); + + for (cpu = 0; cpu < hyp_nr_cpus; cpu++) { + struct hyp_rb_per_cpu *cpu_buffer = per_cpu_ptr(&trace_rb, cpu); + + hyp_rb_teardown(cpu_buffer); + } + + hyp_teardown_bpage_backing(); +} + +void __pkvm_teardown_tracing(void) +{ + hyp_spin_lock(&trace_rb_lock); + __pkvm_teardown_tracing_locked(); + hyp_spin_unlock(&trace_rb_lock); +} + +static bool rb_page_desc_fits_desc(struct rb_page_desc *pdesc, + unsigned long desc_end) +{ + unsigned long *end; + + /* Check we can at least read nr_pages */ + if ((unsigned long)&pdesc->nr_page_va >= desc_end) + return false; + + end = &pdesc->page_va[pdesc->nr_page_va]; + + return (unsigned long)end <= desc_end; +} + +int __pkvm_load_tracing(unsigned long desc_hva, size_t desc_size) +{ + struct hyp_trace_desc *desc = (struct hyp_trace_desc *)kern_hyp_va(desc_hva); + struct trace_page_desc *trace_pdesc = &desc->page_desc; + struct hyp_buffer_page *bpage_backing_start; + struct rb_page_desc *pdesc; + int ret, cpu; + + if (!desc_size || !PAGE_ALIGNED(desc_hva) || !PAGE_ALIGNED(desc_size)) + return -EINVAL; + + ret = __pkvm_host_donate_hyp(hyp_virt_to_pfn((void *)desc), + desc_size >> PAGE_SHIFT); + if (ret) + return ret; + + hyp_spin_lock(&trace_rb_lock); + + ret = hyp_setup_bpage_backing(desc); + if (ret) + goto err; + + bpage_backing_start = (struct hyp_buffer_page *)hyp_buffer_pages_backing.start; + + for_each_rb_page_desc(pdesc, cpu, trace_pdesc) { + struct hyp_rb_per_cpu *cpu_buffer; + int cpu; + + ret = -EINVAL; + if (!rb_page_desc_fits_desc(pdesc, desc_hva + desc_size)) + break; + + cpu = pdesc->cpu; + if (cpu >= hyp_nr_cpus) + break; + + cpu_buffer = per_cpu_ptr(&trace_rb, cpu); + + ret = hyp_rb_init(pdesc, bpage_backing_start, cpu_buffer); + if (ret) + break; + + bpage_backing_start += pdesc->nr_page_va; + } + +err: + if (ret) + __pkvm_teardown_tracing_locked(); + + hyp_spin_unlock(&trace_rb_lock); + + WARN_ON(__pkvm_hyp_donate_host(hyp_virt_to_pfn((void *)desc), + desc_size >> PAGE_SHIFT)); + return ret; +} + +int __pkvm_enable_tracing(bool enable) +{ + int cpu, ret = enable ? 
-EINVAL : 0;
+
+	hyp_spin_lock(&trace_rb_lock);
+	for (cpu = 0; cpu < hyp_nr_cpus; cpu++) {
+		struct hyp_rb_per_cpu *cpu_buffer = per_cpu_ptr(&trace_rb, cpu);
+
+		if (enable) {
+			if (!hyp_rb_enable_writing(cpu_buffer))
+				ret = 0;
+		} else {
+			hyp_rb_disable_writing(cpu_buffer);
+		}
+	}
+	hyp_spin_unlock(&trace_rb_lock);
+
+	return ret;
+}
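The status field above implements a small three-state protocol: a writer may only reserve an entry while the buffer is READY, and teardown spins until any in-flight writer has committed. Below is a minimal stand-alone sketch of that protocol, using C11 atomics in place of the kernel's cmpxchg helpers; names and atomics are illustrative, not the hypervisor's code.

/* Sketch of the HYP_RB status protocol with C11 atomics (illustrative). */
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

enum { HYP_RB_UNAVAILABLE, HYP_RB_READY, HYP_RB_WRITING };

static _Atomic unsigned int status = HYP_RB_READY;

static bool writer_enter(void)
{
	unsigned int expected = HYP_RB_READY;

	/* A writer may only enter while the buffer is READY */
	return atomic_compare_exchange_strong(&status, &expected, HYP_RB_WRITING);
}

static void writer_commit(void)
{
	/* Release pairs with teardown: data visible before READY again */
	atomic_store_explicit(&status, HYP_RB_READY, memory_order_release);
}

static void teardown(void)
{
	unsigned int expected;

	/* Spin while a writer is in flight, then latch UNAVAILABLE */
	do {
		expected = HYP_RB_READY;
	} while (!atomic_compare_exchange_strong(&status, &expected,
						 HYP_RB_UNAVAILABLE) &&
		 expected == HYP_RB_WRITING);
}

int main(void)
{
	if (writer_enter())
		writer_commit();
	teardown();
	printf("final status: %u\n", status);
	return 0;
}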
From patchwork Mon Feb 24 12:13:48 2025
Subject: [PATCH 06/11] KVM: arm64: Add hyp tracing to tracefs
From: Vincent Donnefort <vdonnefort@google.com>
Date: Mon, 24 Feb 2025 12:13:48 +0000
Message-ID: <20250224121353.98697-7-vdonnefort@google.com>

When running with KVM protected mode, the hypervisor is able to generate
events into tracefs-compatible ring-buffers. Plug those ring-buffers into
tracefs. The interface is found in hyp/ and mirrors the hierarchy of any
host instance, easing support by existing user-space tools.

This doesn't provide any event support yet; that will come later.
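As a usage illustration (not part of this patch), the hierarchy can be consumed exactly like a host trace instance; the tracefs mount point below is an assumption:

/* Minimal consumer for the hypervisor trace_pipe; assumes tracefs is
 * mounted at /sys/kernel/tracing and this patch's hypervisor/ directory. */
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
	char buf[4096];
	ssize_t len;
	int fd;

	fd = open("/sys/kernel/tracing/hypervisor/trace_pipe", O_RDONLY);
	if (fd < 0) {
		perror("open");
		return 1;
	}

	/* trace_pipe is a blocking, consuming read, as on host instances */
	while ((len = read(fd, buf, sizeof(buf))) > 0)
		fwrite(buf, 1, len, stdout);

	close(fd);
	return 0;
}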
Signed-off-by: Vincent Donnefort diff --git a/arch/arm64/kvm/Makefile b/arch/arm64/kvm/Makefile index 3cf7adb2b503..865971bb8905 100644 --- a/arch/arm64/kvm/Makefile +++ b/arch/arm64/kvm/Makefile @@ -29,6 +29,8 @@ kvm-$(CONFIG_HW_PERF_EVENTS) += pmu-emul.o pmu.o kvm-$(CONFIG_ARM64_PTR_AUTH) += pauth.o kvm-$(CONFIG_PTDUMP_STAGE2_DEBUGFS) += ptdump.o +kvm-$(CONFIG_TRACING) += hyp_trace.o + always-y := hyp_constants.h hyp-constants.s define rule_gen_hyp_constants diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c index b8e55a441282..f3951d36b9c1 100644 --- a/arch/arm64/kvm/arm.c +++ b/arch/arm64/kvm/arm.c @@ -25,6 +25,7 @@ #define CREATE_TRACE_POINTS #include "trace_arm.h" +#include "hyp_trace.h" #include #include @@ -2319,6 +2320,9 @@ static int __init init_subsystems(void) kvm_register_perf_callbacks(NULL); + err = hyp_trace_init_tracefs(); + if (err) + kvm_err("Failed to initialize Hyp tracing\n"); out: if (err) hyp_cpu_pm_exit(); diff --git a/arch/arm64/kvm/hyp/hyp-constants.c b/arch/arm64/kvm/hyp/hyp-constants.c index b257a3b4bfc5..5c4a797a701f 100644 --- a/arch/arm64/kvm/hyp/hyp-constants.c +++ b/arch/arm64/kvm/hyp/hyp-constants.c @@ -3,11 +3,15 @@ #include #include #include +#include int main(void) { DEFINE(STRUCT_HYP_PAGE_SIZE, sizeof(struct hyp_page)); DEFINE(PKVM_HYP_VM_SIZE, sizeof(struct pkvm_hyp_vm)); DEFINE(PKVM_HYP_VCPU_SIZE, sizeof(struct pkvm_hyp_vcpu)); +#ifdef CONFIG_TRACING + DEFINE(STRUCT_HYP_BUFFER_PAGE_SIZE, sizeof(struct hyp_buffer_page)); +#endif return 0; } diff --git a/arch/arm64/kvm/hyp/nvhe/trace.c b/arch/arm64/kvm/hyp/nvhe/trace.c index 4611cef64566..2f1e5005c5d4 100644 --- a/arch/arm64/kvm/hyp/nvhe/trace.c +++ b/arch/arm64/kvm/hyp/nvhe/trace.c @@ -123,7 +123,7 @@ static int hyp_rb_swap_reader(struct hyp_rb_per_cpu *cpu_buffer) reader->list.prev = head->list.prev; /* The last page before the head */ - last = hyp_bpage_from_link(reader->list.next); + last = hyp_bpage_from_link(head->list.prev); /* The reader page points to the new header page */ hyp_bpage_set_link_flag(reader, HYP_BPAGE_LINK_HEAD); diff --git a/arch/arm64/kvm/hyp_trace.c b/arch/arm64/kvm/hyp_trace.c new file mode 100644 index 000000000000..c08ae8c33052 --- /dev/null +++ b/arch/arm64/kvm/hyp_trace.c @@ -0,0 +1,666 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* + * Copyright (C) 2025 Google LLC + * Author: Vincent Donnefort + */ + +#include +#include +#include + +#include +#include + +#include "hyp_constants.h" +#include "hyp_trace.h" + +#define RB_POLL_MS 100 + +#define TRACEFS_DIR "hypervisor" +#define TRACEFS_MODE_WRITE 0640 +#define TRACEFS_MODE_READ 0440 + +static struct hyp_trace_buffer { + struct hyp_trace_desc *desc; + struct ring_buffer_remote remote; + struct trace_buffer *trace_buffer; + size_t desc_size; + bool tracing_on; + int nr_readers; + struct mutex lock; +} hyp_trace_buffer = { + .lock = __MUTEX_INITIALIZER(hyp_trace_buffer.lock), +}; + +static size_t hyp_trace_buffer_size = 7 << 10; + +/* Number of pages the ring-buffer requires to accommodate for size */ +#define NR_PAGES(size) \ + ((PAGE_ALIGN(size) >> PAGE_SHIFT) + 1) + +static inline bool hyp_trace_buffer_loaded(struct hyp_trace_buffer *hyp_buffer) +{ + return !!hyp_buffer->trace_buffer; +} + +static inline bool hyp_trace_buffer_used(struct hyp_trace_buffer *hyp_buffer) +{ + return hyp_buffer->nr_readers || hyp_buffer->tracing_on || + !ring_buffer_empty(hyp_buffer->trace_buffer); +} + +static int +bpage_backing_alloc(struct hyp_buffer_pages_backing *bpage_backing, size_t size) +{ + size_t backing_size; + void *start; 
+ + backing_size = PAGE_ALIGN(STRUCT_HYP_BUFFER_PAGE_SIZE * NR_PAGES(size) * + num_possible_cpus()); + + start = alloc_pages_exact(backing_size, GFP_KERNEL_ACCOUNT); + if (!start) + return -ENOMEM; + + bpage_backing->start = (unsigned long)start; + bpage_backing->size = backing_size; + + return 0; +} + +static void +bpage_backing_free(struct hyp_buffer_pages_backing *bpage_backing) +{ + free_pages_exact((void *)bpage_backing->start, bpage_backing->size); +} + +static int __get_reader_page(int cpu) +{ + return kvm_call_hyp_nvhe(__pkvm_swap_reader_tracing, cpu); +} + +static void hyp_trace_free_pages(struct hyp_trace_desc *desc) +{ + struct rb_page_desc *rb_desc; + int cpu, id; + + for_each_rb_page_desc(rb_desc, cpu, &desc->page_desc) { + free_page(rb_desc->meta_va); + for (id = 0; id < rb_desc->nr_page_va; id++) + free_page(rb_desc->page_va[id]); + } +} + +static int hyp_trace_alloc_pages(struct hyp_trace_desc *desc, size_t size) +{ + int err = 0, cpu, id, nr_pages = NR_PAGES(size); + struct trace_page_desc *trace_desc; + struct rb_page_desc *rb_desc; + + trace_desc = &desc->page_desc; + trace_desc->nr_cpus = 0; + trace_desc->struct_len = offsetof(struct trace_page_desc, __data); + + rb_desc = (struct rb_page_desc *)&trace_desc->__data[0]; + + for_each_possible_cpu(cpu) { + rb_desc->cpu = cpu; + rb_desc->nr_page_va = 0; + rb_desc->meta_va = (unsigned long)page_to_virt(alloc_page(GFP_KERNEL)); + if (!rb_desc->meta_va) { + err = -ENOMEM; + break; + } + for (id = 0; id < nr_pages; id++) { + rb_desc->page_va[id] = (unsigned long)page_to_virt(alloc_page(GFP_KERNEL)); + if (!rb_desc->page_va[id]) { + err = -ENOMEM; + break; + } + rb_desc->nr_page_va++; + } + trace_desc->nr_cpus++; + trace_desc->struct_len += offsetof(struct rb_page_desc, page_va); + trace_desc->struct_len += sizeof(rb_desc->page_va[0]) * rb_desc->nr_page_va; + rb_desc = __next_rb_page_desc(rb_desc); + } + + if (err) { + hyp_trace_free_pages(desc); + return err; + } + + return 0; +} + +static int __load_page(unsigned long va) +{ + return kvm_call_hyp_nvhe(__pkvm_host_share_hyp, virt_to_pfn((void *)va), 1); +} + +static void __teardown_page(unsigned long va) +{ + WARN_ON(kvm_call_hyp_nvhe(__pkvm_host_unshare_hyp, virt_to_pfn((void *)va), 1)); +} + +static void hyp_trace_teardown_pages(struct hyp_trace_desc *desc, + int last_cpu) +{ + struct rb_page_desc *rb_desc; + int cpu, id; + + for_each_rb_page_desc(rb_desc, cpu, &desc->page_desc) { + if (cpu > last_cpu) + break; + __teardown_page(rb_desc->meta_va); + for (id = 0; id < rb_desc->nr_page_va; id++) + __teardown_page(rb_desc->page_va[id]); + } +} + +static int hyp_trace_load_pages(struct hyp_trace_desc *desc) +{ + int last_loaded_cpu = 0, cpu, id, err = -EINVAL; + struct rb_page_desc *rb_desc; + + for_each_rb_page_desc(rb_desc, cpu, &desc->page_desc) { + err = __load_page(rb_desc->meta_va); + if (err) + break; + + for (id = 0; id < rb_desc->nr_page_va; id++) { + err = __load_page(rb_desc->page_va[id]); + if (err) + break; + } + + if (!err) + continue; + + for (id--; id >= 0; id--) + __teardown_page(rb_desc->page_va[id]); + + last_loaded_cpu = cpu - 1; + + break; + } + + if (!err) + return 0; + + hyp_trace_teardown_pages(desc, last_loaded_cpu); + + return err; +} + +static int hyp_trace_buffer_load(struct hyp_trace_buffer *hyp_buffer, size_t size) +{ + int ret, nr_pages = NR_PAGES(size); + struct rb_page_desc *rbdesc; + struct hyp_trace_desc *desc; + size_t desc_size; + + if (hyp_trace_buffer_loaded(hyp_buffer)) + return 0; + + desc_size = size_add(offsetof(struct hyp_trace_desc, 
page_desc), + offsetof(struct trace_page_desc, __data)); + desc_size = size_add(desc_size, + size_mul(num_possible_cpus(), + struct_size(rbdesc, page_va, nr_pages))); + if (desc_size == SIZE_MAX) + return -E2BIG; + + /* + * The hypervisor will unmap the descriptor from the host to protect the + * reading. Page granularity for the allocation ensures no other + * useful data will be unmapped. + */ + desc_size = PAGE_ALIGN(desc_size); + + desc = (struct hyp_trace_desc *)alloc_pages_exact(desc_size, GFP_KERNEL); + if (!desc) + return -ENOMEM; + + ret = hyp_trace_alloc_pages(desc, size); + if (ret) + goto err_free_desc; + + ret = bpage_backing_alloc(&desc->backing, size); + if (ret) + goto err_free_pages; + + ret = hyp_trace_load_pages(desc); + if (ret) + goto err_free_backing; + + ret = kvm_call_hyp_nvhe(__pkvm_load_tracing, (unsigned long)desc, desc_size); + if (ret) + goto err_teardown_pages; + + hyp_buffer->remote.pdesc = &desc->page_desc; + hyp_buffer->remote.get_reader_page = __get_reader_page; + hyp_buffer->trace_buffer = ring_buffer_remote(&hyp_buffer->remote); + if (!hyp_buffer->trace_buffer) { + ret = -ENOMEM; + goto err_teardown_tracing; + } + + hyp_buffer->desc = desc; + hyp_buffer->desc_size = desc_size; + + return 0; + +err_teardown_tracing: + kvm_call_hyp_nvhe(__pkvm_teardown_tracing); +err_teardown_pages: + hyp_trace_teardown_pages(desc, INT_MAX); +err_free_backing: + bpage_backing_free(&desc->backing); +err_free_pages: + hyp_trace_free_pages(desc); +err_free_desc: + free_pages_exact(desc, desc_size); + + return ret; +} + +static void hyp_trace_buffer_teardown(struct hyp_trace_buffer *hyp_buffer) +{ + struct hyp_trace_desc *desc = hyp_buffer->desc; + size_t desc_size = hyp_buffer->desc_size; + + if (!hyp_trace_buffer_loaded(hyp_buffer)) + return; + + if (hyp_trace_buffer_used(hyp_buffer)) + return; + + if (kvm_call_hyp_nvhe(__pkvm_teardown_tracing)) + return; + + ring_buffer_free(hyp_buffer->trace_buffer); + hyp_trace_teardown_pages(desc, INT_MAX); + bpage_backing_free(&desc->backing); + hyp_trace_free_pages(desc); + free_pages_exact(desc, desc_size); + hyp_buffer->trace_buffer = NULL; +} + +static int hyp_trace_start(void) +{ + struct hyp_trace_buffer *hyp_buffer = &hyp_trace_buffer; + int ret = 0; + + mutex_lock(&hyp_buffer->lock); + + if (hyp_buffer->tracing_on) + goto out; + + ret = hyp_trace_buffer_load(hyp_buffer, hyp_trace_buffer_size); + if (ret) + goto out; + + ret = kvm_call_hyp_nvhe(__pkvm_enable_tracing, true); + if (ret) { + hyp_trace_buffer_teardown(hyp_buffer); + goto out; + } + + hyp_buffer->tracing_on = true; + +out: + mutex_unlock(&hyp_buffer->lock); + + return ret; +} + +static void hyp_trace_stop(void) +{ + struct hyp_trace_buffer *hyp_buffer = &hyp_trace_buffer; + int ret; + + mutex_lock(&hyp_buffer->lock); + + if (!hyp_buffer->tracing_on) + goto end; + + ret = kvm_call_hyp_nvhe(__pkvm_enable_tracing, false); + if (!ret) { + ring_buffer_poll_remote(hyp_buffer->trace_buffer, + RING_BUFFER_ALL_CPUS); + hyp_buffer->tracing_on = false; + hyp_trace_buffer_teardown(hyp_buffer); + } + +end: + mutex_unlock(&hyp_buffer->lock); +} + +static ssize_t hyp_tracing_on(struct file *filp, const char __user *ubuf, + size_t cnt, loff_t *ppos) +{ + unsigned long val; + int ret; + + ret = kstrtoul_from_user(ubuf, cnt, 10, &val); + if (ret) + return ret; + + if (val) + ret = hyp_trace_start(); + else + hyp_trace_stop(); + + return ret ? 
ret : cnt; +} + +static ssize_t hyp_tracing_on_read(struct file *filp, char __user *ubuf, + size_t cnt, loff_t *ppos) +{ + char buf[3]; + int r; + + mutex_lock(&hyp_trace_buffer.lock); + r = sprintf(buf, "%d\n", hyp_trace_buffer.tracing_on); + mutex_unlock(&hyp_trace_buffer.lock); + + return simple_read_from_buffer(ubuf, cnt, ppos, buf, r); +} + +static const struct file_operations hyp_tracing_on_fops = { + .write = hyp_tracing_on, + .read = hyp_tracing_on_read, +}; + +static ssize_t hyp_buffer_size(struct file *filp, const char __user *ubuf, + size_t cnt, loff_t *ppos) +{ + unsigned long val; + int ret; + + ret = kstrtoul_from_user(ubuf, cnt, 10, &val); + if (ret) + return ret; + + if (!val) + return -EINVAL; + + mutex_lock(&hyp_trace_buffer.lock); + hyp_trace_buffer_size = val << 10; /* KB to B */ + mutex_unlock(&hyp_trace_buffer.lock); + + return cnt; +} + +static ssize_t hyp_buffer_size_read(struct file *filp, char __user *ubuf, + size_t cnt, loff_t *ppos) +{ + char buf[64]; + int r; + + mutex_lock(&hyp_trace_buffer.lock); + r = sprintf(buf, "%lu (%s)\n", hyp_trace_buffer_size >> 10, + hyp_trace_buffer_loaded(&hyp_trace_buffer) ? + "loaded" : "unloaded"); + mutex_unlock(&hyp_trace_buffer.lock); + + return simple_read_from_buffer(ubuf, cnt, ppos, buf, r); +} + +static const struct file_operations hyp_buffer_size_fops = { + .write = hyp_buffer_size, + .read = hyp_buffer_size_read, +}; + +static void ht_print_trace_time(struct ht_iterator *iter) +{ + unsigned long usecs_rem; + u64 ts_ns = iter->ts; + + do_div(ts_ns, 1000); + usecs_rem = do_div(ts_ns, USEC_PER_SEC); + + trace_seq_printf(&iter->seq, "%5lu.%06lu: ", + (unsigned long)ts_ns, usecs_rem); +} + +static void ht_print_trace_cpu(struct ht_iterator *iter) +{ + trace_seq_printf(&iter->seq, "[%03d]\t", iter->ent_cpu); +} + +static int ht_print_trace_fmt(struct ht_iterator *iter) +{ + if (iter->lost_events) + trace_seq_printf(&iter->seq, "CPU:%d [LOST %lu EVENTS]\n", + iter->ent_cpu, iter->lost_events); + + ht_print_trace_cpu(iter); + ht_print_trace_time(iter); + + return trace_seq_has_overflowed(&iter->seq) ? 
-EOVERFLOW : 0; +}; + +static struct ring_buffer_event *__ht_next_pipe_event(struct ht_iterator *iter) +{ + struct ring_buffer_event *evt = NULL; + int cpu = iter->cpu; + + if (cpu != RING_BUFFER_ALL_CPUS) { + if (ring_buffer_empty_cpu(iter->trace_buffer, cpu)) + return NULL; + + iter->ent_cpu = cpu; + + return ring_buffer_peek(iter->trace_buffer, cpu, &iter->ts, + &iter->lost_events); + } + + iter->ts = LLONG_MAX; + for_each_possible_cpu(cpu) { + struct ring_buffer_event *_evt; + unsigned long lost_events; + u64 ts; + + if (ring_buffer_empty_cpu(iter->trace_buffer, cpu)) + continue; + + _evt = ring_buffer_peek(iter->trace_buffer, cpu, &ts, + &lost_events); + if (!_evt) + continue; + + if (ts >= iter->ts) + continue; + + iter->ts = ts; + iter->ent_cpu = cpu; + iter->lost_events = lost_events; + evt = _evt; + } + + return evt; +} + +static void *ht_next_pipe_event(struct ht_iterator *iter) +{ + struct ring_buffer_event *event; + + event = __ht_next_pipe_event(iter); + if (!event) + return NULL; + + iter->ent = (struct hyp_entry_hdr *)&event->array[1]; + iter->ent_size = event->array[0]; + + return iter; +} + +static ssize_t +hyp_trace_pipe_read(struct file *file, char __user *ubuf, + size_t cnt, loff_t *ppos) +{ + struct ht_iterator *iter = (struct ht_iterator *)file->private_data; + int ret; + +copy_to_user: + ret = trace_seq_to_user(&iter->seq, ubuf, cnt); + if (ret != -EBUSY) + return ret; + + trace_seq_init(&iter->seq); + + ret = ring_buffer_wait(iter->trace_buffer, iter->cpu, 0, NULL, NULL); + if (ret < 0) + return ret; + + while (ht_next_pipe_event(iter)) { + int prev_len = iter->seq.seq.len; + + if (ht_print_trace_fmt(iter)) { + iter->seq.seq.len = prev_len; + break; + } + + ring_buffer_consume(iter->trace_buffer, iter->ent_cpu, NULL, + NULL); + } + + goto copy_to_user; +} + +static void __poll_remote(struct work_struct *work) +{ + struct delayed_work *dwork = to_delayed_work(work); + struct ht_iterator *iter; + + iter = container_of(dwork, struct ht_iterator, poll_work); + + ring_buffer_poll_remote(iter->trace_buffer, iter->cpu); + + schedule_delayed_work((struct delayed_work *)work, + msecs_to_jiffies(RB_POLL_MS)); +} + +static int hyp_trace_pipe_open(struct inode *inode, struct file *file) +{ + struct hyp_trace_buffer *hyp_buffer = &hyp_trace_buffer; + int cpu = (s64)inode->i_private; + struct ht_iterator *iter = NULL; + int ret; + + mutex_lock(&hyp_buffer->lock); + + if (hyp_buffer->nr_readers == INT_MAX) { + ret = -EBUSY; + goto unlock; + } + + ret = hyp_trace_buffer_load(hyp_buffer, hyp_trace_buffer_size); + if (ret) + goto unlock; + + iter = kzalloc(sizeof(*iter), GFP_KERNEL); + if (!iter) { + ret = -ENOMEM; + goto unlock; + } + iter->trace_buffer = hyp_buffer->trace_buffer; + iter->cpu = cpu; + trace_seq_init(&iter->seq); + file->private_data = iter; + + ret = ring_buffer_poll_remote(hyp_buffer->trace_buffer, cpu); + if (ret) + goto unlock; + + INIT_DELAYED_WORK(&iter->poll_work, __poll_remote); + schedule_delayed_work(&iter->poll_work, msecs_to_jiffies(RB_POLL_MS)); + + hyp_buffer->nr_readers++; + +unlock: + if (ret) { + hyp_trace_buffer_teardown(hyp_buffer); + kfree(iter); + } + + mutex_unlock(&hyp_buffer->lock); + + return ret; +} + +static int hyp_trace_pipe_release(struct inode *inode, struct file *file) +{ + struct hyp_trace_buffer *hyp_buffer = &hyp_trace_buffer; + struct ht_iterator *iter = file->private_data; + + cancel_delayed_work_sync(&iter->poll_work); + + mutex_lock(&hyp_buffer->lock); + + WARN_ON(--hyp_buffer->nr_readers < 0); + + 
hyp_trace_buffer_teardown(hyp_buffer);
+
+	mutex_unlock(&hyp_buffer->lock);
+
+	kfree(iter);
+
+	return 0;
+}
+
+static const struct file_operations hyp_trace_pipe_fops = {
+	.open = hyp_trace_pipe_open,
+	.read = hyp_trace_pipe_read,
+	.release = hyp_trace_pipe_release,
+};
+
+int hyp_trace_init_tracefs(void)
+{
+	struct dentry *root, *per_cpu_root;
+	char per_cpu_name[16];
+	long cpu;
+
+	if (!is_protected_kvm_enabled())
+		return 0;
+
+	root = tracefs_create_dir(TRACEFS_DIR, NULL);
+	if (!root) {
+		pr_err("Failed to create tracefs "TRACEFS_DIR"/\n");
+		return -ENODEV;
+	}
+
+	tracefs_create_file("tracing_on", TRACEFS_MODE_WRITE, root, NULL,
+			    &hyp_tracing_on_fops);
+
+	tracefs_create_file("buffer_size_kb", TRACEFS_MODE_WRITE, root, NULL,
+			    &hyp_buffer_size_fops);
+
+	tracefs_create_file("trace_pipe", TRACEFS_MODE_WRITE, root,
+			    (void *)RING_BUFFER_ALL_CPUS, &hyp_trace_pipe_fops);
+
+	per_cpu_root = tracefs_create_dir("per_cpu", root);
+	if (!per_cpu_root) {
+		pr_err("Failed to create tracefs folder "TRACEFS_DIR"/per_cpu/\n");
+		return -ENODEV;
+	}
+
+	for_each_possible_cpu(cpu) {
+		struct dentry *per_cpu_dir;
+
+		snprintf(per_cpu_name, sizeof(per_cpu_name), "cpu%ld", cpu);
+		per_cpu_dir = tracefs_create_dir(per_cpu_name, per_cpu_root);
+		if (!per_cpu_dir) {
+			pr_warn("Failed to create tracefs "TRACEFS_DIR"/per_cpu/cpu%ld\n",
+				cpu);
+			continue;
+		}
+
+		tracefs_create_file("trace_pipe", TRACEFS_MODE_READ, per_cpu_dir,
+				    (void *)cpu, &hyp_trace_pipe_fops);
+	}
+
+	return 0;
+}
diff --git a/arch/arm64/kvm/hyp_trace.h b/arch/arm64/kvm/hyp_trace.h
new file mode 100644
index 000000000000..14fc06c625a6
--- /dev/null
+++ b/arch/arm64/kvm/hyp_trace.h
@@ -0,0 +1,28 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+#ifndef __ARM64_KVM_HYP_TRACE_H__
+#define __ARM64_KVM_HYP_TRACE_H__
+
+#include <linux/trace_seq.h>
+#include <linux/workqueue.h>
+
+struct ht_iterator {
+	struct trace_buffer *trace_buffer;
+	int cpu;
+	struct hyp_entry_hdr *ent;
+	unsigned long lost_events;
+	int ent_cpu;
+	size_t ent_size;
+	u64 ts;
+	void *spare;
+	size_t copy_leftover;
+	struct trace_seq seq;
+	struct delayed_work poll_work;
+};
+
+#ifdef CONFIG_TRACING
+int hyp_trace_init_tracefs(void);
+#else
+static inline int hyp_trace_init_tracefs(void) { return 0; }
+#endif
+#endif
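To make the descriptor sizing in hyp_trace_buffer_load() above easier to follow: the descriptor packs one flexible-array rb_page_desc per CPU behind a trace_page_desc header. A hedged stand-alone sketch of the same arithmetic, with simplified structure layouts (the real code uses the overflow-checked size_add()/struct_size() helpers):

/* Sketch of trace descriptor sizing; layouts simplified for illustration. */
#include <stddef.h>
#include <stdio.h>

struct rb_page_desc {
	int cpu;
	int nr_page_va;
	unsigned long meta_va;
	unsigned long page_va[];	/* one entry per ring-buffer page */
};

struct trace_page_desc {
	int nr_cpus;
	char __data[];			/* rb_page_desc entries, one per CPU */
};

struct hyp_trace_desc {
	/* backing-area descriptor omitted for brevity */
	struct trace_page_desc page_desc;
};

static size_t desc_size(size_t nr_cpus, size_t nr_pages)
{
	size_t size = offsetof(struct hyp_trace_desc, page_desc) +
		      offsetof(struct trace_page_desc, __data);

	/* One flexible rb_page_desc per CPU, each carrying nr_pages VAs */
	size += nr_cpus * (offsetof(struct rb_page_desc, page_va) +
			   nr_pages * sizeof(unsigned long));

	return size;
}

int main(void)
{
	printf("%zu bytes for 8 CPUs x 8 pages\n", desc_size(8, 8));
	return 0;
}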
From patchwork Mon Feb 24 12:13:49 2025
Subject: [PATCH 07/11] KVM: arm64: Add clock for hyp tracefs
From: Vincent Donnefort <vdonnefort@google.com>
Date: Mon, 24 Feb 2025 12:13:49 +0000
Message-ID: <20250224121353.98697-8-vdonnefort@google.com>
Configure the hypervisor tracing clock before starting tracing. For
tracing purposes, the boot clock is interesting as it doesn't stop on
suspend. However, it is corrected on a regular basis, which implies we
need to re-evaluate it every once in a while.

Cc: John Stultz
Cc: Thomas Gleixner
Cc: Stephen Boyd
Cc: Christopher S. Hall
Cc: Richard Cochran
Signed-off-by: Vincent Donnefort <vdonnefort@google.com>

diff --git a/arch/arm64/include/asm/kvm_asm.h b/arch/arm64/include/asm/kvm_asm.h
index b5893e0afe8e..87d3e0e73b68 100644
--- a/arch/arm64/include/asm/kvm_asm.h
+++ b/arch/arm64/include/asm/kvm_asm.h
@@ -87,6 +87,7 @@ enum __kvm_host_smccc_func {
 	__KVM_HOST_SMCCC_FUNC___pkvm_vcpu_load,
 	__KVM_HOST_SMCCC_FUNC___pkvm_vcpu_put,
 	__KVM_HOST_SMCCC_FUNC___pkvm_tlb_flush_vmid,
+	__KVM_HOST_SMCCC_FUNC___pkvm_update_clock_tracing,
 	__KVM_HOST_SMCCC_FUNC___pkvm_load_tracing,
 	__KVM_HOST_SMCCC_FUNC___pkvm_teardown_tracing,
 	__KVM_HOST_SMCCC_FUNC___pkvm_enable_tracing,
diff --git a/arch/arm64/kvm/hyp/include/nvhe/trace.h b/arch/arm64/kvm/hyp/include/nvhe/trace.h
index bf74a6ee322d..6f1cc571b47a 100644
--- a/arch/arm64/kvm/hyp/include/nvhe/trace.h
+++ b/arch/arm64/kvm/hyp/include/nvhe/trace.h
@@ -16,6 +16,7 @@ struct hyp_buffer_page {
 void *tracing_reserve_entry(unsigned long length);
 void tracing_commit_entry(void);
 
+void __pkvm_update_clock_tracing(u32 mult, u32 shift, u64 epoch_ns, u64 epoch_cyc);
 int __pkvm_load_tracing(unsigned long desc_va, size_t desc_size);
 void __pkvm_teardown_tracing(void);
 int __pkvm_enable_tracing(bool enable);
@@ -24,6 +25,8 @@ int __pkvm_swap_reader_tracing(unsigned int cpu);
 static inline void *tracing_reserve_entry(unsigned long length) { return NULL; }
 static inline void tracing_commit_entry(void) { }
 
+static inline
+void __pkvm_update_clock_tracing(u32 mult, u32 shift, u64 epoch_ns, u64 epoch_cyc) { }
 static inline int __pkvm_load_tracing(unsigned long desc_va, size_t desc_size) { return -ENODEV; }
 static inline void __pkvm_teardown_tracing(void) { }
 static inline int __pkvm_enable_tracing(bool enable) { return -ENODEV; }
diff --git a/arch/arm64/kvm/hyp/nvhe/hyp-main.c b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
index ced0a161d56e..a8b497b22407 100644
--- a/arch/arm64/kvm/hyp/nvhe/hyp-main.c
+++ b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
@@ -571,6 +571,18 @@ static void handle___pkvm_teardown_vm(struct kvm_cpu_context *host_ctxt)
 	cpu_reg(host_ctxt, 1) = __pkvm_teardown_vm(handle);
 }
 
+static void handle___pkvm_update_clock_tracing(struct kvm_cpu_context *host_ctxt)
+{
+	DECLARE_REG(u32, mult, host_ctxt, 1);
+	DECLARE_REG(u32, shift, host_ctxt, 2);
+	DECLARE_REG(u64, epoch_ns, host_ctxt, 3);
+	DECLARE_REG(u64, epoch_cyc, host_ctxt, 4);
+
+	__pkvm_update_clock_tracing(mult, shift, epoch_ns, epoch_cyc);
+
+	cpu_reg(host_ctxt, 1) = 0;
+}
+
 static void handle___pkvm_load_tracing(struct kvm_cpu_context *host_ctxt)
 {
 	DECLARE_REG(unsigned long, desc_hva, host_ctxt, 1);
@@ -639,6 +651,7 @@ static const hcall_t host_hcall[] = {
 	HANDLE_FUNC(__pkvm_vcpu_load),
 	HANDLE_FUNC(__pkvm_vcpu_put),
HANDLE_FUNC(__pkvm_tlb_flush_vmid), + HANDLE_FUNC(__pkvm_update_clock_tracing), HANDLE_FUNC(__pkvm_load_tracing), HANDLE_FUNC(__pkvm_teardown_tracing), HANDLE_FUNC(__pkvm_enable_tracing), diff --git a/arch/arm64/kvm/hyp/nvhe/trace.c b/arch/arm64/kvm/hyp/nvhe/trace.c index 2f1e5005c5d4..d79b6539377e 100644 --- a/arch/arm64/kvm/hyp/nvhe/trace.c +++ b/arch/arm64/kvm/hyp/nvhe/trace.c @@ -430,6 +430,22 @@ static void hyp_teardown_bpage_backing(void) hyp_buffer_pages_backing.size = 0; } +void __pkvm_update_clock_tracing(u32 mult, u32 shift, u64 epoch_ns, u64 epoch_cyc) +{ + int cpu; + + /* After this loop, all CPUs are observing the new bank... */ + for (cpu = 0; cpu < hyp_nr_cpus; cpu++) { + struct hyp_rb_per_cpu *cpu_buffer = per_cpu_ptr(&trace_rb, cpu); + + while (READ_ONCE(cpu_buffer->status) == HYP_RB_WRITING) + ; + } + + /* ...we can now override the old one and swap. */ + trace_clock_update(mult, shift, epoch_ns, epoch_cyc); +} + int __pkvm_swap_reader_tracing(unsigned int cpu) { int ret = 0; diff --git a/arch/arm64/kvm/hyp_trace.c b/arch/arm64/kvm/hyp_trace.c index c08ae8c33052..3f91ad69c25b 100644 --- a/arch/arm64/kvm/hyp_trace.c +++ b/arch/arm64/kvm/hyp_trace.c @@ -16,10 +16,33 @@ #define RB_POLL_MS 100 +/* Same 10min used by clocksource when width is more than 32-bits */ +#define CLOCK_MAX_CONVERSION_S 600 +/* + * Time to give for the clock init. Long enough to get a good mult/shift + * estimation. Short enough to not delay the tracing start too much. + */ +#define CLOCK_INIT_MS 100 +/* + * Time between clock checks. Must be small enough to catch clock deviation when + * it is still tiny. + */ +#define CLOCK_UPDATE_MS 500 + #define TRACEFS_DIR "hypervisor" #define TRACEFS_MODE_WRITE 0640 #define TRACEFS_MODE_READ 0440 +struct hyp_trace_clock { + u64 cycles; + u64 cyc_overflow64; + u64 boot; + u32 mult; + u32 shift; + struct delayed_work work; + struct completion ready; +}; + static struct hyp_trace_buffer { struct hyp_trace_desc *desc; struct ring_buffer_remote remote; @@ -28,6 +51,7 @@ static struct hyp_trace_buffer { bool tracing_on; int nr_readers; struct mutex lock; + struct hyp_trace_clock clock; } hyp_trace_buffer = { .lock = __MUTEX_INITIALIZER(hyp_trace_buffer.lock), }; @@ -74,6 +98,103 @@ bpage_backing_free(struct hyp_buffer_pages_backing *bpage_backing) free_pages_exact((void *)bpage_backing->start, bpage_backing->size); } +static void __hyp_clock_work(struct work_struct *work) +{ + struct delayed_work *dwork = to_delayed_work(work); + struct hyp_trace_buffer *hyp_buffer; + struct hyp_trace_clock *hyp_clock; + struct system_time_snapshot snap; + u64 rate, delta_cycles; + u64 boot, delta_boot; + + hyp_clock = container_of(dwork, struct hyp_trace_clock, work); + hyp_buffer = container_of(hyp_clock, struct hyp_trace_buffer, clock); + + ktime_get_snapshot(&snap); + boot = ktime_to_ns(snap.boot); + + delta_boot = boot - hyp_clock->boot; + delta_cycles = snap.cycles - hyp_clock->cycles; + + /* Compare hyp clock with the kernel boot clock */ + if (hyp_clock->mult) { + u64 err, cur = delta_cycles; + + if (WARN_ON_ONCE(cur >= hyp_clock->cyc_overflow64)) { + __uint128_t tmp = (__uint128_t)cur * hyp_clock->mult; + + cur = tmp >> hyp_clock->shift; + } else { + cur *= hyp_clock->mult; + cur >>= hyp_clock->shift; + } + cur += hyp_clock->boot; + + err = abs_diff(cur, boot); + /* No deviation, only update epoch if necessary */ + if (!err) { + if (delta_cycles >= (hyp_clock->cyc_overflow64 >> 1)) + goto fast_forward; + + goto resched; + } + + /* Warn if the error is above tracing precision 
(1us) */
+		if (hyp_buffer->tracing_on && err > NSEC_PER_USEC)
+			pr_warn_ratelimited("hyp trace clock off by %lluus\n",
+					    err / NSEC_PER_USEC);
+	}
+
+	rate = div64_u64(delta_cycles * NSEC_PER_SEC, delta_boot);
+
+	clocks_calc_mult_shift(&hyp_clock->mult, &hyp_clock->shift,
+			       rate, NSEC_PER_SEC, CLOCK_MAX_CONVERSION_S);
+
+	/* Add a comfortable 50% margin */
+	hyp_clock->cyc_overflow64 = (U64_MAX / hyp_clock->mult) >> 1;
+
+fast_forward:
+	hyp_clock->cycles = snap.cycles;
+	hyp_clock->boot = boot;
+	kvm_call_hyp_nvhe(__pkvm_update_clock_tracing, hyp_clock->mult,
+			  hyp_clock->shift, hyp_clock->boot, hyp_clock->cycles);
+	complete(&hyp_clock->ready);
+
+resched:
+	schedule_delayed_work(&hyp_clock->work,
+			      msecs_to_jiffies(CLOCK_UPDATE_MS));
+}
+
+static void hyp_clock_start(struct hyp_trace_buffer *hyp_buffer)
+{
+	struct hyp_trace_clock *hyp_clock = &hyp_buffer->clock;
+	struct system_time_snapshot snap;
+
+	ktime_get_snapshot(&snap);
+
+	hyp_clock->boot = ktime_to_ns(snap.boot);
+	hyp_clock->cycles = snap.cycles;
+	hyp_clock->mult = 0;
+
+	init_completion(&hyp_clock->ready);
+	INIT_DELAYED_WORK(&hyp_clock->work, __hyp_clock_work);
+	schedule_delayed_work(&hyp_clock->work, msecs_to_jiffies(CLOCK_INIT_MS));
+}
+
+static void hyp_clock_stop(struct hyp_trace_buffer *hyp_buffer)
+{
+	struct hyp_trace_clock *hyp_clock = &hyp_buffer->clock;
+
+	cancel_delayed_work_sync(&hyp_clock->work);
+}
+
+static void hyp_clock_wait(struct hyp_trace_buffer *hyp_buffer)
+{
+	struct hyp_trace_clock *hyp_clock = &hyp_buffer->clock;
+
+	wait_for_completion(&hyp_clock->ready);
+}
+
 static int __get_reader_page(int cpu)
 {
 	return kvm_call_hyp_nvhe(__pkvm_swap_reader_tracing, cpu);
@@ -297,10 +418,14 @@ static int hyp_trace_start(void)
 	if (hyp_buffer->tracing_on)
 		goto out;
 
+	hyp_clock_start(hyp_buffer);
+
 	ret = hyp_trace_buffer_load(hyp_buffer, hyp_trace_buffer_size);
 	if (ret)
 		goto out;
 
+	hyp_clock_wait(hyp_buffer);
+
 	ret = kvm_call_hyp_nvhe(__pkvm_enable_tracing, true);
 	if (ret) {
 		hyp_trace_buffer_teardown(hyp_buffer);
@@ -310,6 +435,9 @@ static int hyp_trace_start(void)
 	hyp_buffer->tracing_on = true;
 
 out:
+	if (!hyp_buffer->tracing_on)
+		hyp_clock_stop(hyp_buffer);
+
 	mutex_unlock(&hyp_buffer->lock);
 
 	return ret;
@@ -329,6 +457,7 @@ static void hyp_trace_stop(void)
 	if (!ret) {
 		ring_buffer_poll_remote(hyp_buffer->trace_buffer,
 					RING_BUFFER_ALL_CPUS);
+		hyp_clock_stop(hyp_buffer);
 		hyp_buffer->tracing_on = false;
 		hyp_trace_buffer_teardown(hyp_buffer);
 	}
@@ -617,6 +746,14 @@ static const struct file_operations hyp_trace_pipe_fops = {
 	.release = hyp_trace_pipe_release,
 };
 
+static int hyp_trace_clock_show(struct seq_file *m, void *v)
+{
+	seq_puts(m, "[boot]\n");
+
+	return 0;
+}
+DEFINE_SHOW_ATTRIBUTE(hyp_trace_clock);
+
 int hyp_trace_init_tracefs(void)
 {
 	struct dentry *root, *per_cpu_root;
@@ -641,6 +778,9 @@ int hyp_trace_init_tracefs(void)
 	tracefs_create_file("trace_pipe", TRACEFS_MODE_WRITE, root,
 			    (void *)RING_BUFFER_ALL_CPUS, &hyp_trace_pipe_fops);
 
+	tracefs_create_file("trace_clock", TRACEFS_MODE_READ, root, NULL,
+			    &hyp_trace_clock_fops);
+
 	per_cpu_root = tracefs_create_dir("per_cpu", root);
 	if (!per_cpu_root) {
 		pr_err("Failed to create tracefs folder "TRACEFS_DIR"/per_cpu/\n");
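The mult/shift pair computed in __hyp_clock_work() turns a cycle delta into nanoseconds with one multiply and one shift; the 128-bit path only exists for deltas beyond the 64-bit overflow horizon. A stand-alone sketch of that conversion, with illustrative values and a GCC-style __int128 (assumptions, not the series' code):

/* Sketch of the hyp clock cycles-to-ns conversion; values illustrative. */
#include <stdint.h>
#include <stdio.h>

static uint64_t cyc_to_ns(uint64_t cycles, uint32_t mult, uint32_t shift,
			  uint64_t overflow64)
{
	/* Beyond this horizon, cycles * mult would overflow 64 bits */
	if (cycles >= overflow64)
		return (uint64_t)(((unsigned __int128)cycles * mult) >> shift);

	return (cycles * mult) >> shift;
}

int main(void)
{
	/* e.g. a 24MHz counter: mult/shift approximating 1e9/24e6 ns/cycle */
	uint32_t mult = 2796202667u, shift = 26;
	uint64_t overflow64 = (UINT64_MAX / mult) >> 1;	/* 50% margin */

	/* One second's worth of cycles should print ~1000000000 ns */
	printf("%llu ns\n",
	       (unsigned long long)cyc_to_ns(24000000, mult, shift, overflow64));
	return 0;
}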
From patchwork Mon Feb 24 12:13:50 2025
Subject: [PATCH 08/11] KVM: arm64: Add raw interface for hyp tracefs
From: Vincent Donnefort <vdonnefort@google.com>
Date: Mon, 24 Feb 2025 12:13:50 +0000
Message-ID: <20250224121353.98697-9-vdonnefort@google.com>
The raw interface enables userspace tools such as trace-cmd to read the
ring-buffer directly, without any decoding by the kernel.

Signed-off-by: Vincent Donnefort <vdonnefort@google.com>

diff --git a/arch/arm64/kvm/hyp_trace.c b/arch/arm64/kvm/hyp_trace.c
index 3f91ad69c25b..38d97e34eada 100644
--- a/arch/arm64/kvm/hyp_trace.c
+++ b/arch/arm64/kvm/hyp_trace.c
@@ -746,6 +746,85 @@ static const struct file_operations hyp_trace_pipe_fops = {
 	.release = hyp_trace_pipe_release,
 };
 
+static ssize_t
+hyp_trace_raw_read(struct file *file, char __user *ubuf,
+		   size_t cnt, loff_t *ppos)
+{
+	struct ht_iterator *iter = (struct ht_iterator *)file->private_data;
+	size_t size;
+	int ret;
+
+	if (iter->copy_leftover)
+		goto read;
+
+again:
+	ret = ring_buffer_read_page(iter->trace_buffer,
+				    (struct buffer_data_read_page *)iter->spare,
+				    cnt, iter->cpu, 0);
+	if (ret < 0) {
+		if (!ring_buffer_empty_cpu(iter->trace_buffer, iter->cpu))
+			return 0;
+
+		ret = ring_buffer_wait(iter->trace_buffer, iter->cpu, 0, NULL,
+				       NULL);
+		if (ret < 0)
+			return ret;
+
+		goto again;
+	}
+
+	iter->copy_leftover = 0;
+
+read:
+	size = PAGE_SIZE - iter->copy_leftover;
+	if (size > cnt)
+		size = cnt;
+
+	ret = copy_to_user(ubuf, iter->spare + PAGE_SIZE - size, size);
+	if (ret == size)
+		return -EFAULT;
+
+	size -= ret;
+	*ppos += size;
+	iter->copy_leftover = ret;
+
+	return size;
+}
+
+static int hyp_trace_raw_open(struct inode *inode, struct file *file)
+{
+	int ret = hyp_trace_pipe_open(inode, file);
+	struct ht_iterator *iter;
+
+	if (ret)
+		return ret;
+
+	iter = file->private_data;
+	iter->spare = ring_buffer_alloc_read_page(iter->trace_buffer, iter->cpu);
+	if (IS_ERR(iter->spare)) {
+		ret = PTR_ERR(iter->spare);
+		iter->spare = NULL;
+		return ret;
+	}
+
+	return 0;
+}
+
+static int hyp_trace_raw_release(struct inode *inode, struct file *file)
+{
+	struct ht_iterator *iter = file->private_data;
+
+	ring_buffer_free_read_page(iter->trace_buffer, iter->cpu, iter->spare);
+
+	return hyp_trace_pipe_release(inode, file);
+}
+
+static const struct file_operations hyp_trace_raw_fops = {
+	.open = hyp_trace_raw_open,
+	.read = hyp_trace_raw_read,
+	.release = hyp_trace_raw_release,
+};
+
 static int hyp_trace_clock_show(struct seq_file *m, void *v)
 {
 	seq_puts(m, "[boot]\n");
@@ -800,6 +879,9 @@ int hyp_trace_init_tracefs(void)
 
 		tracefs_create_file("trace_pipe", TRACEFS_MODE_READ, per_cpu_dir,
 				    (void *)cpu, &hyp_trace_pipe_fops);
+
+		tracefs_create_file("trace_pipe_raw", TRACEFS_MODE_READ, per_cpu_dir,
+				    (void *)cpu, &hyp_trace_raw_fops);
 	}
 
 	return 0;
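For illustration (not part of the patch), a raw consumer reads whole binary sub-buffer pages and decodes them off-line, which is what trace-cmd does; the page size and mount point below are assumptions:

/* Read binary sub-buffer pages from the hyp raw interface (sketch). */
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

#define SUBBUF_SIZE 4096	/* assumed PAGE_SIZE */

int main(void)
{
	char page[SUBBUF_SIZE];
	ssize_t len;
	int fd;

	fd = open("/sys/kernel/tracing/hypervisor/per_cpu/cpu0/trace_pipe_raw",
		  O_RDONLY);
	if (fd < 0) {
		perror("open");
		return 1;
	}

	/* Each read returns raw ring-buffer data for off-line decoding */
	while ((len = read(fd, page, sizeof(page))) > 0)
		fprintf(stderr, "read %zd bytes\n", len);

	close(fd);
	return 0;
}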
From patchwork Mon Feb 24 12:13:51 2025
X-Patchwork-Submitter: Vincent Donnefort
X-Patchwork-Id: 13988022
Date: Mon, 24 Feb 2025 12:13:51 +0000
In-Reply-To: <20250224121353.98697-1-vdonnefort@google.com>
References: <20250224121353.98697-1-vdonnefort@google.com>
Message-ID: <20250224121353.98697-10-vdonnefort@google.com>
Subject: [PATCH 09/11] KVM: arm64: Add trace interface for hyp tracefs
From: Vincent Donnefort
To: rostedt@goodmis.org, mhiramat@kernel.org, mathieu.desnoyers@efficios.com,
 linux-trace-kernel@vger.kernel.org, maz@kernel.org, oliver.upton@linux.dev,
 joey.gouly@arm.com, suzuki.poulose@arm.com, yuzenghui@huawei.com
Cc: kvmarm@lists.linux.dev, linux-arm-kernel@lists.infradead.org,
 jstultz@google.com, qperret@google.com, will@kernel.org,
 kernel-team@android.com, linux-kernel@vger.kernel.org, Vincent Donnefort

The trace interface is solely here to reset tracing. Non-consuming read
is not yet supported due to the lack of support in the ring-buffer meta
page.
Signed-off-by: Vincent Donnefort diff --git a/arch/arm64/include/asm/kvm_asm.h b/arch/arm64/include/asm/kvm_asm.h index 87d3e0e73b68..74f10847a55e 100644 --- a/arch/arm64/include/asm/kvm_asm.h +++ b/arch/arm64/include/asm/kvm_asm.h @@ -91,6 +91,7 @@ enum __kvm_host_smccc_func { __KVM_HOST_SMCCC_FUNC___pkvm_load_tracing, __KVM_HOST_SMCCC_FUNC___pkvm_teardown_tracing, __KVM_HOST_SMCCC_FUNC___pkvm_enable_tracing, + __KVM_HOST_SMCCC_FUNC___pkvm_reset_tracing, __KVM_HOST_SMCCC_FUNC___pkvm_swap_reader_tracing, }; diff --git a/arch/arm64/kvm/hyp/include/nvhe/trace.h b/arch/arm64/kvm/hyp/include/nvhe/trace.h index 6f1cc571b47a..28bbb54b7a0b 100644 --- a/arch/arm64/kvm/hyp/include/nvhe/trace.h +++ b/arch/arm64/kvm/hyp/include/nvhe/trace.h @@ -20,6 +20,7 @@ void __pkvm_update_clock_tracing(u32 mult, u32 shift, u64 epoch_ns, u64 epoch_cy int __pkvm_load_tracing(unsigned long desc_va, size_t desc_size); void __pkvm_teardown_tracing(void); int __pkvm_enable_tracing(bool enable); +int __pkvm_reset_tracing(unsigned int cpu); int __pkvm_swap_reader_tracing(unsigned int cpu); #else static inline void *tracing_reserve_entry(unsigned long length) { return NULL; } @@ -30,6 +31,7 @@ void __pkvm_update_clock_tracing(u32 mult, u32 shift, u64 epoch_ns, u64 epoch_cy static inline int __pkvm_load_tracing(unsigned long desc_va, size_t desc_size) { return -ENODEV; } static inline void __pkvm_teardown_tracing(void) { } static inline int __pkvm_enable_tracing(bool enable) { return -ENODEV; } +static inline int __pkvm_reset_tracing(unsigned int cpu) { return -ENODEV; } static inline int __pkvm_swap_reader_tracing(unsigned int cpu) { return -ENODEV; } #endif #endif diff --git a/arch/arm64/kvm/hyp/nvhe/hyp-main.c b/arch/arm64/kvm/hyp/nvhe/hyp-main.c index a8b497b22407..e2419c97c57d 100644 --- a/arch/arm64/kvm/hyp/nvhe/hyp-main.c +++ b/arch/arm64/kvm/hyp/nvhe/hyp-main.c @@ -605,6 +605,13 @@ static void handle___pkvm_enable_tracing(struct kvm_cpu_context *host_ctxt) cpu_reg(host_ctxt, 1) = __pkvm_enable_tracing(enable); } +static void handle___pkvm_reset_tracing(struct kvm_cpu_context *host_ctxt) +{ + DECLARE_REG(unsigned int, cpu, host_ctxt, 1); + + cpu_reg(host_ctxt, 1) = __pkvm_reset_tracing(cpu); +} + static void handle___pkvm_swap_reader_tracing(struct kvm_cpu_context *host_ctxt) { DECLARE_REG(unsigned int, cpu, host_ctxt, 1); @@ -655,6 +662,7 @@ static const hcall_t host_hcall[] = { HANDLE_FUNC(__pkvm_load_tracing), HANDLE_FUNC(__pkvm_teardown_tracing), HANDLE_FUNC(__pkvm_enable_tracing), + HANDLE_FUNC(__pkvm_reset_tracing), HANDLE_FUNC(__pkvm_swap_reader_tracing), }; diff --git a/arch/arm64/kvm/hyp/nvhe/trace.c b/arch/arm64/kvm/hyp/nvhe/trace.c index d79b6539377e..bf935645ed91 100644 --- a/arch/arm64/kvm/hyp/nvhe/trace.c +++ b/arch/arm64/kvm/hyp/nvhe/trace.c @@ -262,7 +262,7 @@ void tracing_commit_entry(void) smp_store_release(&cpu_buffer->status, HYP_RB_READY); } -static void hyp_rb_disable_writing(struct hyp_rb_per_cpu *cpu_buffer) +static u32 hyp_rb_disable_writing(struct hyp_rb_per_cpu *cpu_buffer) { u32 prev_status; @@ -272,6 +272,8 @@ static void hyp_rb_disable_writing(struct hyp_rb_per_cpu *cpu_buffer) HYP_RB_READY, HYP_RB_UNAVAILABLE); } while (prev_status == HYP_RB_WRITING); + + return prev_status; } static int hyp_rb_enable_writing(struct hyp_rb_per_cpu *cpu_buffer) @@ -284,6 +286,44 @@ static int hyp_rb_enable_writing(struct hyp_rb_per_cpu *cpu_buffer) return 0; } +static int hyp_rb_reset(struct hyp_rb_per_cpu *cpu_buffer) +{ + struct hyp_buffer_page *bpage; + u32 prev_status; + + if 
(!hyp_rb_loaded(cpu_buffer)) + return -ENODEV; + + prev_status = hyp_rb_disable_writing(cpu_buffer); + + while (!hyp_bpage_is_head(cpu_buffer->head_page)) + cpu_buffer->head_page = hyp_bpage_next_page(cpu_buffer->head_page); + + bpage = cpu_buffer->tail_page = cpu_buffer->head_page; + do { + hyp_bpage_reset(bpage); + bpage = hyp_bpage_next_page(bpage); + } while (bpage != cpu_buffer->head_page); + + hyp_bpage_reset(cpu_buffer->reader_page); + + cpu_buffer->last_overrun = 0; + cpu_buffer->write_stamp = 0; + + cpu_buffer->meta->reader.read = 0; + cpu_buffer->meta->reader.lost_events = 0; + cpu_buffer->meta->entries = 0; + cpu_buffer->meta->overrun = 0; + cpu_buffer->meta->read = 0; + meta_pages_lost(cpu_buffer->meta) = 0; + meta_pages_touched(cpu_buffer->meta) = 0; + + if (prev_status == HYP_RB_READY) + hyp_rb_enable_writing(cpu_buffer); + + return 0; +} + static void hyp_rb_teardown(struct hyp_rb_per_cpu *cpu_buffer) { int i; @@ -572,3 +612,17 @@ int __pkvm_enable_tracing(bool enable) return ret; } + +int __pkvm_reset_tracing(unsigned int cpu) +{ + int ret = 0; + + if (cpu >= hyp_nr_cpus) + return -EINVAL; + + hyp_spin_lock(&trace_rb_lock); + ret = hyp_rb_reset(per_cpu_ptr(&trace_rb, cpu)); + hyp_spin_unlock(&trace_rb_lock); + + return ret; +} diff --git a/arch/arm64/kvm/hyp_trace.c b/arch/arm64/kvm/hyp_trace.c index 38d97e34eada..03a6813cbe66 100644 --- a/arch/arm64/kvm/hyp_trace.c +++ b/arch/arm64/kvm/hyp_trace.c @@ -200,6 +200,11 @@ static int __get_reader_page(int cpu) return kvm_call_hyp_nvhe(__pkvm_swap_reader_tracing, cpu); } +static int __reset(int cpu) +{ + return kvm_call_hyp_nvhe(__pkvm_reset_tracing, cpu); +} + static void hyp_trace_free_pages(struct hyp_trace_desc *desc) { struct rb_page_desc *rb_desc; @@ -361,6 +366,7 @@ static int hyp_trace_buffer_load(struct hyp_trace_buffer *hyp_buffer, size_t siz hyp_buffer->remote.pdesc = &desc->page_desc; hyp_buffer->remote.get_reader_page = __get_reader_page; + hyp_buffer->remote.reset = __reset; hyp_buffer->trace_buffer = ring_buffer_remote(&hyp_buffer->remote); if (!hyp_buffer->trace_buffer) { ret = -ENOMEM; @@ -825,6 +831,49 @@ static const struct file_operations hyp_trace_raw_fops = { .release = hyp_trace_raw_release, }; +static void hyp_trace_reset(int cpu) +{ + struct hyp_trace_buffer *hyp_buffer = &hyp_trace_buffer; + + mutex_lock(&hyp_buffer->lock); + + if (!hyp_trace_buffer_loaded(hyp_buffer)) + goto out; + + if (cpu == RING_BUFFER_ALL_CPUS) + ring_buffer_reset(hyp_buffer->trace_buffer); + else + ring_buffer_reset_cpu(hyp_buffer->trace_buffer, cpu); + +out: + mutex_unlock(&hyp_buffer->lock); +} + +static int hyp_trace_open(struct inode *inode, struct file *file) +{ + int cpu = (s64)inode->i_private; + + if (file->f_mode & FMODE_WRITE) { + hyp_trace_reset(cpu); + + return 0; + } + + return -EPERM; +} + +static ssize_t hyp_trace_write(struct file *filp, const char __user *ubuf, + size_t count, loff_t *ppos) +{ + return count; +} + +static const struct file_operations hyp_trace_fops = { + .open = hyp_trace_open, + .write = hyp_trace_write, + .release = NULL, +}; + static int hyp_trace_clock_show(struct seq_file *m, void *v) { seq_puts(m, "[boot]\n"); @@ -857,6 +906,9 @@ int hyp_trace_init_tracefs(void) tracefs_create_file("trace_pipe", TRACEFS_MODE_WRITE, root, (void *)RING_BUFFER_ALL_CPUS, &hyp_trace_pipe_fops); + tracefs_create_file("trace", TRACEFS_MODE_WRITE, root, + (void *)RING_BUFFER_ALL_CPUS, &hyp_trace_fops); + tracefs_create_file("trace_clock", TRACEFS_MODE_READ, root, NULL, &hyp_trace_clock_fops); @@ -882,6 +934,9 @@ 
int hyp_trace_init_tracefs(void)
 		tracefs_create_file("trace_pipe_raw", TRACEFS_MODE_READ, per_cpu_dir,
 				    (void *)cpu, &hyp_trace_raw_fops);
+
+		tracefs_create_file("trace", TRACEFS_MODE_WRITE, per_cpu_dir,
+				    (void *)cpu, &hyp_trace_fops);
 	}
 
 	return 0;
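
[Editor's note: a minimal sketch, not from the series, of what resetting
the hypervisor buffers through the "trace" file added above amounts to.
Per hyp_trace_open(), any open for writing triggers hyp_trace_reset(),
which is what "echo 0 > trace" in the shell does; the path below is an
assumption.

#include <fcntl.h>
#include <unistd.h>

int main(void)
{
	/* O_TRUNC mirrors shell redirection; the write-mode open resets. */
	int fd = open("/sys/kernel/tracing/hypervisor/trace",
		      O_WRONLY | O_TRUNC);

	if (fd < 0)
		return 1;

	close(fd);
	return 0;
}
]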
From patchwork Mon Feb 24 12:13:52 2025
X-Patchwork-Submitter: Vincent Donnefort
X-Patchwork-Id: 13988028
Date: Mon, 24 Feb 2025 12:13:52 +0000
In-Reply-To: <20250224121353.98697-1-vdonnefort@google.com>
References: <20250224121353.98697-1-vdonnefort@google.com>
Message-ID: <20250224121353.98697-11-vdonnefort@google.com>
Subject: [PATCH 10/11] KVM: arm64: Add support for hyp events
From: Vincent Donnefort
To: rostedt@goodmis.org, mhiramat@kernel.org, mathieu.desnoyers@efficios.com,
 linux-trace-kernel@vger.kernel.org, maz@kernel.org, oliver.upton@linux.dev,
 joey.gouly@arm.com, suzuki.poulose@arm.com, yuzenghui@huawei.com
Cc: kvmarm@lists.linux.dev, linux-arm-kernel@lists.infradead.org,
 jstultz@google.com, qperret@google.com, will@kernel.org,
 kernel-team@android.com, linux-kernel@vger.kernel.org, Vincent Donnefort

Following the introduction of hyp tracing for pKVM, add the ability to
describe and emit events into the hypervisor ring-buffers.

Hypervisor events are declared in kvm_hypevents.h and can be emitted
with trace_<event_name>() in a similar fashion to the kernel tracefs
events. hyp_enter and hyp_exit events are provided as examples.
Signed-off-by: Vincent Donnefort diff --git a/arch/arm64/include/asm/kvm_asm.h b/arch/arm64/include/asm/kvm_asm.h index 74f10847a55e..afeb983ca97b 100644 --- a/arch/arm64/include/asm/kvm_asm.h +++ b/arch/arm64/include/asm/kvm_asm.h @@ -93,6 +93,7 @@ enum __kvm_host_smccc_func { __KVM_HOST_SMCCC_FUNC___pkvm_enable_tracing, __KVM_HOST_SMCCC_FUNC___pkvm_reset_tracing, __KVM_HOST_SMCCC_FUNC___pkvm_swap_reader_tracing, + __KVM_HOST_SMCCC_FUNC___pkvm_enable_event, }; #define DECLARE_KVM_VHE_SYM(sym) extern char sym[] diff --git a/arch/arm64/include/asm/kvm_define_hypevents.h b/arch/arm64/include/asm/kvm_define_hypevents.h new file mode 100644 index 000000000000..cda7dc27dba7 --- /dev/null +++ b/arch/arm64/include/asm/kvm_define_hypevents.h @@ -0,0 +1,61 @@ +/* SPDX-License-Identifier: GPL-2.0 */ + +#include + +#include +#include + +#ifndef HYP_EVENT_FILE +# undef __ARM64_KVM_HYPEVENTS_H_ +# define __HYP_EVENT_FILE +#else +# define __HYP_EVENT_FILE __stringify(HYP_EVENT_FILE) +#endif + +#define HYP_EVENT(__name, __proto, __struct, __assign, __printk) \ + HYP_EVENT_FORMAT(__name, __struct); \ + static void hyp_event_trace_##__name(struct ht_iterator *iter) \ + { \ + struct trace_hyp_format_##__name __maybe_unused *__entry = \ + (struct trace_hyp_format_##__name *)iter->ent; \ + trace_seq_puts(&iter->seq, #__name); \ + trace_seq_putc(&iter->seq, ' '); \ + trace_seq_printf(&iter->seq, __printk); \ + trace_seq_putc(&iter->seq, '\n'); \ + } +#define HYP_EVENT_MULTI_READ +#include __HYP_EVENT_FILE + +#undef he_field +#define he_field(_type, _item) \ + { \ + .type = #_type, .name = #_item, \ + .size = sizeof(_type), .align = __alignof__(_type), \ + .is_signed = is_signed_type(_type), \ + }, +#undef HYP_EVENT +#define HYP_EVENT(__name, __proto, __struct, __assign, __printk) \ + static struct trace_event_fields hyp_event_fields_##__name[] = { \ + __struct \ + {} \ + }; +#include __HYP_EVENT_FILE + +#undef HYP_EVENT +#undef HE_PRINTK +#define __entry REC +#define HE_PRINTK(fmt, args...) "\"" fmt "\", " __stringify(args) +#define HYP_EVENT(__name, __proto, __struct, __assign, __printk) \ + static char hyp_event_print_fmt_##__name[] = __printk; \ + static bool hyp_event_enabled_##__name; \ + struct hyp_event __section("_hyp_events."#__name) \ + hyp_event_##__name = { \ + .name = #__name, \ + .enabled = &hyp_event_enabled_##__name, \ + .fields = hyp_event_fields_##__name, \ + .print_fmt = hyp_event_print_fmt_##__name, \ + .trace_func = hyp_event_trace_##__name, \ + } +#include __HYP_EVENT_FILE + +#undef HYP_EVENT_MULTI_READ diff --git a/arch/arm64/include/asm/kvm_hypevents.h b/arch/arm64/include/asm/kvm_hypevents.h new file mode 100644 index 000000000000..0b98a87a1250 --- /dev/null +++ b/arch/arm64/include/asm/kvm_hypevents.h @@ -0,0 +1,31 @@ +/* SPDX-License-Identifier: GPL-2.0 */ + +#if !defined(__ARM64_KVM_HYPEVENTS_H_) || defined(HYP_EVENT_MULTI_READ) +#define __ARM64_KVM_HYPEVENTS_H_ + +#ifdef __KVM_NVHE_HYPERVISOR__ +#include +#endif + +/* + * Hypervisor events definitions. 
+ */ + +HYP_EVENT(hyp_enter, + HE_PROTO(void), + HE_STRUCT( + ), + HE_ASSIGN( + ), + HE_PRINTK(" ") +); + +HYP_EVENT(hyp_exit, + HE_PROTO(void), + HE_STRUCT( + ), + HE_ASSIGN( + ), + HE_PRINTK(" ") +); +#endif diff --git a/arch/arm64/include/asm/kvm_hypevents_defs.h b/arch/arm64/include/asm/kvm_hypevents_defs.h new file mode 100644 index 000000000000..473bf4363d82 --- /dev/null +++ b/arch/arm64/include/asm/kvm_hypevents_defs.h @@ -0,0 +1,41 @@ +/* SPDX-License-Identifier: GPL-2.0 */ + +#ifndef __ARM64_KVM_HYPEVENTS_DEFS_H +#define __ARM64_KVM_HYPEVENTS_DEFS_H + +struct hyp_event_id { + unsigned short id; + void *data; +}; + +#define HYP_EVENT_NAME_MAX 32 + +struct hyp_event { + char name[HYP_EVENT_NAME_MAX]; + bool *enabled; + char *print_fmt; + struct trace_event_fields *fields; + void (*trace_func)(struct ht_iterator *iter); + int id; +}; + +struct hyp_entry_hdr { + unsigned short id; +}; + +/* + * Hyp events definitions common to the hyp and the host + */ +#define HYP_EVENT_FORMAT(__name, __struct) \ + struct __packed trace_hyp_format_##__name { \ + struct hyp_entry_hdr hdr; \ + __struct \ + } + +#define HE_PROTO(args...) args +#define HE_STRUCT(args...) args +#define HE_ASSIGN(args...) args +#define HE_PRINTK(args...) args + +#define he_field(type, item) type item; +#endif diff --git a/arch/arm64/include/asm/kvm_hyptrace.h b/arch/arm64/include/asm/kvm_hyptrace.h index 7da6a248c7fa..7b66bd06537f 100644 --- a/arch/arm64/include/asm/kvm_hyptrace.h +++ b/arch/arm64/include/asm/kvm_hyptrace.h @@ -4,6 +4,22 @@ #include #include +#include +#include + +struct ht_iterator { + struct trace_buffer *trace_buffer; + int cpu; + struct hyp_entry_hdr *ent; + unsigned long lost_events; + int ent_cpu; + size_t ent_size; + u64 ts; + void *spare; + size_t copy_leftover; + struct trace_seq seq; + struct delayed_work poll_work; +}; /* * Host donations to the hypervisor to store the struct hyp_buffer_page. diff --git a/arch/arm64/kernel/image-vars.h b/arch/arm64/kernel/image-vars.h index ef3a69cc398e..3b8bf8ded48c 100644 --- a/arch/arm64/kernel/image-vars.h +++ b/arch/arm64/kernel/image-vars.h @@ -137,6 +137,10 @@ KVM_NVHE_ALIAS(__hyp_bss_start); KVM_NVHE_ALIAS(__hyp_bss_end); KVM_NVHE_ALIAS(__hyp_rodata_start); KVM_NVHE_ALIAS(__hyp_rodata_end); +#ifdef CONFIG_TRACING +KVM_NVHE_ALIAS(__hyp_event_ids_start); +KVM_NVHE_ALIAS(__hyp_event_ids_end); +#endif /* pKVM static key */ KVM_NVHE_ALIAS(kvm_protected_mode_initialized); diff --git a/arch/arm64/kernel/vmlinux.lds.S b/arch/arm64/kernel/vmlinux.lds.S index e73326bd3ff7..14e52049c4e5 100644 --- a/arch/arm64/kernel/vmlinux.lds.S +++ b/arch/arm64/kernel/vmlinux.lds.S @@ -13,12 +13,23 @@ *(__kvm_ex_table) \ __stop___kvm_ex_table = .; +#ifdef CONFIG_TRACING +#define HYPERVISOR_EVENT_IDS \ + . = ALIGN(PAGE_SIZE); \ + __hyp_event_ids_start = .; \ + *(HYP_SECTION_NAME(.event_ids)) \ + __hyp_event_ids_end = .; +#else +#define HYPERVISOR_EVENT_IDS +#endif + #define HYPERVISOR_DATA_SECTIONS \ HYP_SECTION_NAME(.rodata) : { \ . = ALIGN(PAGE_SIZE); \ __hyp_rodata_start = .; \ *(HYP_SECTION_NAME(.data..ro_after_init)) \ *(HYP_SECTION_NAME(.rodata)) \ + HYPERVISOR_EVENT_IDS \ . 
= ALIGN(PAGE_SIZE); \ __hyp_rodata_end = .; \ } @@ -201,6 +212,13 @@ SECTIONS ASSERT(SIZEOF(.got.plt) == 0 || SIZEOF(.got.plt) == 0x18, "Unexpected GOT/PLT entries detected!") +#ifdef CONFIG_TRACING + .rodata.hyp_events : { + __hyp_events_start = .; + *(SORT(_hyp_events.*)) + __hyp_events_end = .; + } +#endif /* code sections that are never executed via the kernel mapping */ .rodata.text : { TRAMP_TEXT diff --git a/arch/arm64/kvm/Makefile b/arch/arm64/kvm/Makefile index 865971bb8905..f9e208273031 100644 --- a/arch/arm64/kvm/Makefile +++ b/arch/arm64/kvm/Makefile @@ -29,7 +29,7 @@ kvm-$(CONFIG_HW_PERF_EVENTS) += pmu-emul.o pmu.o kvm-$(CONFIG_ARM64_PTR_AUTH) += pauth.o kvm-$(CONFIG_PTDUMP_STAGE2_DEBUGFS) += ptdump.o -kvm-$(CONFIG_TRACING) += hyp_trace.o +kvm-$(CONFIG_TRACING) += hyp_events.o hyp_trace.o always-y := hyp_constants.h hyp-constants.s diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c index f3951d36b9c1..2f1b869efc80 100644 --- a/arch/arm64/kvm/arm.c +++ b/arch/arm64/kvm/arm.c @@ -2644,6 +2644,8 @@ static int __init init_hyp_mode(void) kvm_hyp_init_symbols(); + hyp_trace_init_events(); + if (is_protected_kvm_enabled()) { if (IS_ENABLED(CONFIG_ARM64_PTR_AUTH_KERNEL) && cpus_have_final_cap(ARM64_HAS_ADDRESS_AUTH)) diff --git a/arch/arm64/kvm/hyp/include/nvhe/arm-smccc.h b/arch/arm64/kvm/hyp/include/nvhe/arm-smccc.h new file mode 100644 index 000000000000..4b69d33e4f2d --- /dev/null +++ b/arch/arm64/kvm/hyp/include/nvhe/arm-smccc.h @@ -0,0 +1,13 @@ +/* SPDX-License-Identifier: GPL-2.0-only */ + +#include + +#include + +#undef arm_smccc_1_1_smc +#define arm_smccc_1_1_smc(...) \ + do { \ + trace_hyp_exit(); \ + __arm_smccc_1_1(SMCCC_SMC_INST, __VA_ARGS__); \ + trace_hyp_enter(); \ + } while (0) diff --git a/arch/arm64/kvm/hyp/include/nvhe/define_events.h b/arch/arm64/kvm/hyp/include/nvhe/define_events.h new file mode 100644 index 000000000000..2298b49cb355 --- /dev/null +++ b/arch/arm64/kvm/hyp/include/nvhe/define_events.h @@ -0,0 +1,21 @@ +/* SPDX-License-Identifier: GPL-2.0 */ + +#ifndef HYP_EVENT_FILE +# define __HYP_EVENT_FILE +#else +# define __HYP_EVENT_FILE __stringify(HYP_EVENT_FILE) +#endif + +#undef HYP_EVENT +#define HYP_EVENT(__name, __proto, __struct, __assign, __printk) \ + atomic_t __ro_after_init __name##_enabled = ATOMIC_INIT(0); \ + struct hyp_event_id hyp_event_id_##__name \ + __section(".hyp.event_ids."#__name) = { \ + .data = (void *)&__name##_enabled, \ + } + +#define HYP_EVENT_MULTI_READ +#include __HYP_EVENT_FILE +#undef HYP_EVENT_MULTI_READ + +#undef HYP_EVENT diff --git a/arch/arm64/kvm/hyp/include/nvhe/trace.h b/arch/arm64/kvm/hyp/include/nvhe/trace.h index 28bbb54b7a0b..1bf9c5e61aee 100644 --- a/arch/arm64/kvm/hyp/include/nvhe/trace.h +++ b/arch/arm64/kvm/hyp/include/nvhe/trace.h @@ -2,6 +2,7 @@ #ifndef __ARM64_KVM_HYP_NVHE_TRACE_H #define __ARM64_KVM_HYP_NVHE_TRACE_H #include +#include /* Internal struct exported for hyp-constants.c */ struct hyp_buffer_page { @@ -15,6 +16,24 @@ struct hyp_buffer_page { #ifdef CONFIG_TRACING void *tracing_reserve_entry(unsigned long length); void tracing_commit_entry(void); +#define HYP_EVENT(__name, __proto, __struct, __assign, __printk) \ + HYP_EVENT_FORMAT(__name, __struct); \ + extern atomic_t __name##_enabled; \ + extern struct hyp_event_id hyp_event_id_##__name; \ + static __always_inline void trace_##__name(__proto) \ + { \ + size_t length = sizeof(struct trace_hyp_format_##__name); \ + struct trace_hyp_format_##__name *__entry; \ + \ + if (!atomic_read(&__name##_enabled)) \ + return; \ + __entry = 
tracing_reserve_entry(length); \ + if (!__entry) \ + return; \ + __entry->hdr.id = hyp_event_id_##__name.id; \ + __assign \ + tracing_commit_entry(); \ + } void __pkvm_update_clock_tracing(u32 mult, u32 shift, u64 epoch_ns, u64 epoch_cyc); int __pkvm_load_tracing(unsigned long desc_va, size_t desc_size); @@ -22,9 +41,12 @@ void __pkvm_teardown_tracing(void); int __pkvm_enable_tracing(bool enable); int __pkvm_reset_tracing(unsigned int cpu); int __pkvm_swap_reader_tracing(unsigned int cpu); +int __pkvm_enable_event(unsigned short id, bool enable); #else static inline void *tracing_reserve_entry(unsigned long length) { return NULL; } static inline void tracing_commit_entry(void) { } +#define HYP_EVENT(__name, __proto, __struct, __assign, __printk) \ + static inline void trace_##__name(__proto) {} static inline void __pkvm_update_clock_tracing(u32 mult, u32 shift, u64 epoch_ns, u64 epoch_cyc) { } @@ -33,5 +55,6 @@ static inline void __pkvm_teardown_tracing(void) { } static inline int __pkvm_enable_tracing(bool enable) { return -ENODEV; } static inline int __pkvm_reset_tracing(unsigned int cpu) { return -ENODEV; } static inline int __pkvm_swap_reader_tracing(unsigned int cpu) { return -ENODEV; } +static inline int __pkvm_enable_event(unsigned short id, bool enable) { return -ENODEV; } #endif #endif diff --git a/arch/arm64/kvm/hyp/nvhe/Makefile b/arch/arm64/kvm/hyp/nvhe/Makefile index 40f243c44cf5..fc11e47a1e90 100644 --- a/arch/arm64/kvm/hyp/nvhe/Makefile +++ b/arch/arm64/kvm/hyp/nvhe/Makefile @@ -28,7 +28,7 @@ hyp-obj-y := timer-sr.o sysreg-sr.o debug-sr.o switch.o tlb.o hyp-init.o host.o hyp-obj-y += ../vgic-v3-sr.o ../aarch32.o ../vgic-v2-cpuif-proxy.o ../entry.o \ ../fpsimd.o ../hyp-entry.o ../exception.o ../pgtable.o hyp-obj-$(CONFIG_LIST_HARDENED) += list_debug.o -hyp-obj-$(CONFIG_TRACING) += clock.o trace.o +hyp-obj-$(CONFIG_TRACING) += clock.o events.o trace.o hyp-obj-y += $(lib-objs) ## diff --git a/arch/arm64/kvm/hyp/nvhe/events.c b/arch/arm64/kvm/hyp/nvhe/events.c new file mode 100644 index 000000000000..5905b42cb0d0 --- /dev/null +++ b/arch/arm64/kvm/hyp/nvhe/events.c @@ -0,0 +1,36 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* + * Copyright (C) 2025 Google LLC + * Author: Vincent Donnefort + */ + +#include +#include + +#include + +extern struct hyp_event_id __hyp_event_ids_start[]; +extern struct hyp_event_id __hyp_event_ids_end[]; + +int __pkvm_enable_event(unsigned short id, bool enable) +{ + struct hyp_event_id *event_id = __hyp_event_ids_start; + atomic_t *enable_key; + + for (; (unsigned long)event_id < (unsigned long)__hyp_event_ids_end; + event_id++) { + if (event_id->id != id) + continue; + + enable_key = (atomic_t *)event_id->data; + enable_key = hyp_fixmap_map(__hyp_pa(enable_key)); + + atomic_set(enable_key, enable); + + hyp_fixmap_unmap(); + + return 0; + } + + return -EINVAL; +} diff --git a/arch/arm64/kvm/hyp/nvhe/ffa.c b/arch/arm64/kvm/hyp/nvhe/ffa.c index e433dfab882a..6c740f8dcf82 100644 --- a/arch/arm64/kvm/hyp/nvhe/ffa.c +++ b/arch/arm64/kvm/hyp/nvhe/ffa.c @@ -26,10 +26,10 @@ * the duration and are therefore serialised. 
*/ -#include #include #include +#include #include #include #include diff --git a/arch/arm64/kvm/hyp/nvhe/hyp-main.c b/arch/arm64/kvm/hyp/nvhe/hyp-main.c index e2419c97c57d..96dde58f4984 100644 --- a/arch/arm64/kvm/hyp/nvhe/hyp-main.c +++ b/arch/arm64/kvm/hyp/nvhe/hyp-main.c @@ -12,6 +12,7 @@ #include #include #include +#include #include #include @@ -619,6 +620,14 @@ static void handle___pkvm_swap_reader_tracing(struct kvm_cpu_context *host_ctxt) cpu_reg(host_ctxt, 1) = __pkvm_swap_reader_tracing(cpu); } +static void handle___pkvm_enable_event(struct kvm_cpu_context *host_ctxt) +{ + DECLARE_REG(unsigned short, id, host_ctxt, 1); + DECLARE_REG(bool, enable, host_ctxt, 2); + + cpu_reg(host_ctxt, 1) = __pkvm_enable_event(id, enable); +} + typedef void (*hcall_t)(struct kvm_cpu_context *); #define HANDLE_FUNC(x) [__KVM_HOST_SMCCC_FUNC_##x] = (hcall_t)handle_##x @@ -664,6 +673,7 @@ static const hcall_t host_hcall[] = { HANDLE_FUNC(__pkvm_enable_tracing), HANDLE_FUNC(__pkvm_reset_tracing), HANDLE_FUNC(__pkvm_swap_reader_tracing), + HANDLE_FUNC(__pkvm_enable_event), }; static void handle_host_hcall(struct kvm_cpu_context *host_ctxt) @@ -704,7 +714,9 @@ static void handle_host_hcall(struct kvm_cpu_context *host_ctxt) static void default_host_smc_handler(struct kvm_cpu_context *host_ctxt) { + trace_hyp_exit(); __kvm_hyp_host_forward_smc(host_ctxt); + trace_hyp_enter(); } static void handle_host_smc(struct kvm_cpu_context *host_ctxt) @@ -728,6 +740,8 @@ void handle_trap(struct kvm_cpu_context *host_ctxt) { u64 esr = read_sysreg_el2(SYS_ESR); + trace_hyp_enter(); + switch (ESR_ELx_EC(esr)) { case ESR_ELx_EC_HVC64: handle_host_hcall(host_ctxt); @@ -742,4 +756,6 @@ void handle_trap(struct kvm_cpu_context *host_ctxt) default: BUG(); } + + trace_hyp_exit(); } diff --git a/arch/arm64/kvm/hyp/nvhe/hyp.lds.S b/arch/arm64/kvm/hyp/nvhe/hyp.lds.S index f4562f417d3f..2f9262057bac 100644 --- a/arch/arm64/kvm/hyp/nvhe/hyp.lds.S +++ b/arch/arm64/kvm/hyp/nvhe/hyp.lds.S @@ -16,6 +16,12 @@ SECTIONS { HYP_SECTION(.text) HYP_SECTION(.data..ro_after_init) HYP_SECTION(.rodata) +#ifdef CONFIG_TRACING + . = ALIGN(PAGE_SIZE); + BEGIN_HYP_SECTION(.event_ids) + *(SORT(.hyp.event_ids.*)) + END_HYP_SECTION +#endif /* * .hyp..data..percpu needs to be page aligned to maintain the same diff --git a/arch/arm64/kvm/hyp/nvhe/psci-relay.c b/arch/arm64/kvm/hyp/nvhe/psci-relay.c index 9c2ce1e0e99a..00bc2ab94d59 100644 --- a/arch/arm64/kvm/hyp/nvhe/psci-relay.c +++ b/arch/arm64/kvm/hyp/nvhe/psci-relay.c @@ -6,11 +6,12 @@ #include #include +#include #include -#include #include #include +#include #include #include @@ -153,6 +154,7 @@ static int psci_cpu_suspend(u64 func_id, struct kvm_cpu_context *host_ctxt) DECLARE_REG(u64, power_state, host_ctxt, 1); DECLARE_REG(unsigned long, pc, host_ctxt, 2); DECLARE_REG(unsigned long, r0, host_ctxt, 3); + int ret; struct psci_boot_args *boot_args; struct kvm_nvhe_init_params *init_params; @@ -171,9 +173,11 @@ static int psci_cpu_suspend(u64 func_id, struct kvm_cpu_context *host_ctxt) * Will either return if shallow sleep state, or wake up into the entry * point if it is a deep sleep state. 
*/ - return psci_call(func_id, power_state, - __hyp_pa(&kvm_hyp_cpu_resume), - __hyp_pa(init_params)); + ret = psci_call(func_id, power_state, + __hyp_pa(&kvm_hyp_cpu_resume), + __hyp_pa(init_params)); + + return ret; } static int psci_system_suspend(u64 func_id, struct kvm_cpu_context *host_ctxt) @@ -205,6 +209,7 @@ asmlinkage void __noreturn __kvm_host_psci_cpu_entry(bool is_cpu_on) struct psci_boot_args *boot_args; struct kvm_cpu_context *host_ctxt; + trace_hyp_enter(); host_ctxt = host_data_ptr(host_ctxt); if (is_cpu_on) @@ -218,6 +223,7 @@ asmlinkage void __noreturn __kvm_host_psci_cpu_entry(bool is_cpu_on) if (is_cpu_on) release_boot_args(boot_args); + trace_hyp_exit(); __host_enter(host_ctxt); } diff --git a/arch/arm64/kvm/hyp/nvhe/switch.c b/arch/arm64/kvm/hyp/nvhe/switch.c index 7d2ba6ef0261..bbe035acda89 100644 --- a/arch/arm64/kvm/hyp/nvhe/switch.c +++ b/arch/arm64/kvm/hyp/nvhe/switch.c @@ -7,7 +7,6 @@ #include #include -#include #include #include #include @@ -21,6 +20,7 @@ #include #include #include +#include #include #include #include @@ -349,10 +349,13 @@ int __kvm_vcpu_run(struct kvm_vcpu *vcpu) __debug_switch_to_guest(vcpu); do { + trace_hyp_exit(); + /* Jump in the fire! */ exit_code = __guest_enter(vcpu); /* And we're baaack! */ + trace_hyp_enter(); } while (fixup_guest_exit(vcpu, &exit_code)); __sysreg_save_state_nvhe(guest_ctxt); diff --git a/arch/arm64/kvm/hyp_events.c b/arch/arm64/kvm/hyp_events.c new file mode 100644 index 000000000000..ec56a63f3451 --- /dev/null +++ b/arch/arm64/kvm/hyp_events.c @@ -0,0 +1,159 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* + * Copyright (C) 2023 Google LLC + */ + +#include + +#include +#include +#include + +#include "hyp_trace.h" + +extern struct hyp_event __hyp_events_start[]; +extern struct hyp_event __hyp_events_end[]; + +/* hyp_event section used by the hypervisor */ +extern struct hyp_event_id __hyp_event_ids_start[]; +extern struct hyp_event_id __hyp_event_ids_end[]; + +static ssize_t +hyp_event_write(struct file *filp, const char __user *ubuf, size_t cnt, loff_t *ppos) +{ + struct seq_file *seq_file = (struct seq_file *)filp->private_data; + struct hyp_event *evt = (struct hyp_event *)seq_file->private; + unsigned short id = evt->id; + bool enabling; + int ret; + char c; + + if (!cnt || cnt > 2) + return -EINVAL; + + if (get_user(c, ubuf)) + return -EFAULT; + + switch (c) { + case '1': + enabling = true; + break; + case '0': + enabling = false; + break; + default: + return -EINVAL; + } + + if (enabling != *evt->enabled) { + ret = kvm_call_hyp_nvhe(__pkvm_enable_event, id, enabling); + if (ret) + return ret; + } + + *evt->enabled = enabling; + + return cnt; +} + +static int hyp_event_show(struct seq_file *m, void *v) +{ + struct hyp_event *evt = (struct hyp_event *)m->private; + + seq_printf(m, "%d\n", *evt->enabled); + + return 0; +} + +static int hyp_event_open(struct inode *inode, struct file *filp) +{ + return single_open(filp, hyp_event_show, inode->i_private); +} + +static const struct file_operations hyp_event_fops = { + .open = hyp_event_open, + .write = hyp_event_write, + .read = seq_read, + .llseek = seq_lseek, + .release = single_release, +}; + +static int hyp_event_id_show(struct seq_file *m, void *v) +{ + struct hyp_event *evt = (struct hyp_event *)m->private; + + seq_printf(m, "%d\n", evt->id); + + return 0; +} + +static int hyp_event_id_open(struct inode *inode, struct file *filp) +{ + return single_open(filp, hyp_event_id_show, inode->i_private); +} + +static const struct file_operations hyp_event_id_fops = { 
+ .open = hyp_event_id_open, + .read = seq_read, + .llseek = seq_lseek, + .release = single_release, +}; + +void hyp_trace_init_event_tracefs(struct dentry *parent) +{ + struct hyp_event *event = __hyp_events_start; + + parent = tracefs_create_dir("events", parent); + if (!parent) { + pr_err("Failed to create tracefs folder for hyp events\n"); + return; + } + + parent = tracefs_create_dir("hypervisor", parent); + if (!parent) { + pr_err("Failed to create tracefs folder for hyp events\n"); + return; + } + + for (; (unsigned long)event < (unsigned long)__hyp_events_end; event++) { + struct dentry *event_dir = tracefs_create_dir(event->name, parent); + + if (!event_dir) { + pr_err("Failed to create events/hypervisor/%s\n", + event->name); + continue; + } + + tracefs_create_file("enable", 0640, event_dir, (void *)event, + &hyp_event_fops); + tracefs_create_file("id", 0440, event_dir, (void *)event, + &hyp_event_id_fops); + } +} + +struct hyp_event *hyp_trace_find_event(int id) +{ + struct hyp_event *event = __hyp_events_start + id; + + if ((unsigned long)event >= (unsigned long)__hyp_events_end) + return NULL; + + return event; +} + +/* + * Register hyp events and write their id into the hyp section _hyp_event_ids. + */ +int hyp_trace_init_events(void) +{ + struct hyp_event_id *hyp_event_id = __hyp_event_ids_start; + struct hyp_event *event = __hyp_events_start; + int id = 0; + + /* Events on both sides hypervisor are sorted */ + for (; (unsigned long)event < (unsigned long)__hyp_events_end; + event++, hyp_event_id++, id++) + event->id = hyp_event_id->id = id; + + return 0; +} diff --git a/arch/arm64/kvm/hyp_trace.c b/arch/arm64/kvm/hyp_trace.c index 03a6813cbe66..cb63af69c38d 100644 --- a/arch/arm64/kvm/hyp_trace.c +++ b/arch/arm64/kvm/hyp_trace.c @@ -6,10 +6,12 @@ #include #include +#include #include #include #include +#include #include "hyp_constants.h" #include "hyp_trace.h" @@ -567,6 +569,8 @@ static void ht_print_trace_cpu(struct ht_iterator *iter) static int ht_print_trace_fmt(struct ht_iterator *iter) { + struct hyp_event *e; + if (iter->lost_events) trace_seq_printf(&iter->seq, "CPU:%d [LOST %lu EVENTS]\n", iter->ent_cpu, iter->lost_events); @@ -574,6 +578,12 @@ static int ht_print_trace_fmt(struct ht_iterator *iter) ht_print_trace_cpu(iter); ht_print_trace_time(iter); + e = hyp_trace_find_event(iter->ent->id); + if (e) + e->trace_func(iter); + else + trace_seq_printf(&iter->seq, "Unknown event id %d\n", iter->ent->id); + return trace_seq_has_overflowed(&iter->seq) ? 
-EOVERFLOW : 0;
 };
@@ -939,5 +949,7 @@ int hyp_trace_init_tracefs(void)
 			    (void *)cpu, &hyp_trace_fops);
 	}
 
+	hyp_trace_init_event_tracefs(root);
+
 	return 0;
 }
diff --git a/arch/arm64/kvm/hyp_trace.h b/arch/arm64/kvm/hyp_trace.h
index 14fc06c625a6..3ac648415bf9 100644
--- a/arch/arm64/kvm/hyp_trace.h
+++ b/arch/arm64/kvm/hyp_trace.h
@@ -3,26 +3,13 @@
 #ifndef __ARM64_KVM_HYP_TRACE_H__
 #define __ARM64_KVM_HYP_TRACE_H__
-#include
-#include
-
-struct ht_iterator {
-	struct trace_buffer *trace_buffer;
-	int cpu;
-	struct hyp_entry_hdr *ent;
-	unsigned long lost_events;
-	int ent_cpu;
-	size_t ent_size;
-	u64 ts;
-	void *spare;
-	size_t copy_leftover;
-	struct trace_seq seq;
-	struct delayed_work poll_work;
-};
-
 #ifdef CONFIG_TRACING
 int hyp_trace_init_tracefs(void);
+int hyp_trace_init_events(void);
+struct hyp_event *hyp_trace_find_event(int id);
+void hyp_trace_init_event_tracefs(struct dentry *parent);
 #else
 static inline int hyp_trace_init_tracefs(void) { return 0; }
+static inline int hyp_trace_init_events(void) { return 0; }
 #endif
 #endif
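
[Editor's note: the hyp_enter and hyp_exit events added by this patch
carry no payload, so the field machinery is easy to miss. As a sketch
only, a hypothetical event with a payload could be declared in
kvm_hypevents.h as follows; the event name and fields are made up:

HYP_EVENT(host_hcall,
	HE_PROTO(unsigned long id, unsigned long ret),
	HE_STRUCT(
		he_field(unsigned long, id)
		he_field(unsigned long, ret)
	),
	HE_ASSIGN(
		__entry->id = id;
		__entry->ret = ret;
	),
	HE_PRINTK("id=%lu ret=%lu", __entry->id, __entry->ret)
);

EL2 code would then emit it with trace_host_hcall(id, ret), and the host
would toggle it through events/hypervisor/host_hcall/enable, exactly like
the hyp_enter/hyp_exit events above.]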
From patchwork Mon Feb 24 12:13:53 2025
X-Patchwork-Submitter: Vincent Donnefort
X-Patchwork-Id: 13988027
Date: Mon, 24 Feb 2025 12:13:53 +0000
In-Reply-To: <20250224121353.98697-1-vdonnefort@google.com>
References: <20250224121353.98697-1-vdonnefort@google.com>
Message-ID: <20250224121353.98697-12-vdonnefort@google.com>
Subject: [PATCH 11/11] KVM: arm64: Add kselftest for hyp tracefs
From: Vincent Donnefort
To: rostedt@goodmis.org, mhiramat@kernel.org, mathieu.desnoyers@efficios.com,
 linux-trace-kernel@vger.kernel.org, maz@kernel.org, oliver.upton@linux.dev,
 joey.gouly@arm.com, suzuki.poulose@arm.com, yuzenghui@huawei.com
Cc: kvmarm@lists.linux.dev, linux-arm-kernel@lists.infradead.org,
 jstultz@google.com, qperret@google.com, will@kernel.org,
 kernel-team@android.com, linux-kernel@vger.kernel.org, Vincent Donnefort

Add a test to validate the newly introduced tracefs interface for the
pKVM hypervisor. It covers buffer reset, loading/unloading, extended
timestamps and coherence of the tracing clock.
Signed-off-by: Vincent Donnefort diff --git a/arch/arm64/include/asm/kvm_asm.h b/arch/arm64/include/asm/kvm_asm.h index afeb983ca97b..c798bd6ad27e 100644 --- a/arch/arm64/include/asm/kvm_asm.h +++ b/arch/arm64/include/asm/kvm_asm.h @@ -94,6 +94,7 @@ enum __kvm_host_smccc_func { __KVM_HOST_SMCCC_FUNC___pkvm_reset_tracing, __KVM_HOST_SMCCC_FUNC___pkvm_swap_reader_tracing, __KVM_HOST_SMCCC_FUNC___pkvm_enable_event, + __KVM_HOST_SMCCC_FUNC___pkvm_selftest_event, }; #define DECLARE_KVM_VHE_SYM(sym) extern char sym[] diff --git a/arch/arm64/include/asm/kvm_hypevents.h b/arch/arm64/include/asm/kvm_hypevents.h index 0b98a87a1250..25d8e23a3cc6 100644 --- a/arch/arm64/include/asm/kvm_hypevents.h +++ b/arch/arm64/include/asm/kvm_hypevents.h @@ -28,4 +28,14 @@ HYP_EVENT(hyp_exit, ), HE_PRINTK(" ") ); + +#ifdef CONFIG_PKVM_SELFTESTS +HYP_EVENT(selftest, + HE_PROTO(void), + HE_STRUCT(), + HE_ASSIGN(), + HE_PRINTK(" ") +); #endif + +#endif /* __ARM64_KVM_HYPEVENTS_H_ */ diff --git a/arch/arm64/kvm/Kconfig b/arch/arm64/kvm/Kconfig index ead632ad01b4..fff9d24d7771 100644 --- a/arch/arm64/kvm/Kconfig +++ b/arch/arm64/kvm/Kconfig @@ -46,6 +46,7 @@ menuconfig KVM config NVHE_EL2_DEBUG bool "Debug mode for non-VHE EL2 object" depends on KVM + select PKVM_SELFTESTS help Say Y here to enable the debug mode for the non-VHE KVM EL2 object. Failure reports will BUG() in the hypervisor. This is intended for @@ -83,4 +84,13 @@ config PTDUMP_STAGE2_DEBUGFS If in doubt, say N. +config PKVM_SELFTESTS + bool "Protected KVM hypervisor selftests" + depends on KVM + default n + help + Say Y here to enable pKVM hypervisor testing infrastructure. + + If unsure, say N. + endif # VIRTUALIZATION diff --git a/arch/arm64/kvm/hyp/nvhe/hyp-main.c b/arch/arm64/kvm/hyp/nvhe/hyp-main.c index 96dde58f4984..8f0dafaab568 100644 --- a/arch/arm64/kvm/hyp/nvhe/hyp-main.c +++ b/arch/arm64/kvm/hyp/nvhe/hyp-main.c @@ -628,6 +628,19 @@ static void handle___pkvm_enable_event(struct kvm_cpu_context *host_ctxt) cpu_reg(host_ctxt, 1) = __pkvm_enable_event(id, enable); } +static void handle___pkvm_selftest_event(struct kvm_cpu_context *host_ctxt) +{ + int smc_ret = SMCCC_RET_NOT_SUPPORTED, ret = -EOPNOTSUPP; + +#ifdef CONFIG_PKVM_SELFTESTS + trace_selftest(); + smc_ret = SMCCC_RET_SUCCESS; + ret = 0; +#endif + cpu_reg(host_ctxt, 0) = smc_ret; + cpu_reg(host_ctxt, 1) = ret; +} + typedef void (*hcall_t)(struct kvm_cpu_context *); #define HANDLE_FUNC(x) [__KVM_HOST_SMCCC_FUNC_##x] = (hcall_t)handle_##x @@ -674,6 +687,7 @@ static const hcall_t host_hcall[] = { HANDLE_FUNC(__pkvm_reset_tracing), HANDLE_FUNC(__pkvm_swap_reader_tracing), HANDLE_FUNC(__pkvm_enable_event), + HANDLE_FUNC(__pkvm_selftest_event), }; static void handle_host_hcall(struct kvm_cpu_context *host_ctxt) diff --git a/arch/arm64/kvm/hyp_trace.c b/arch/arm64/kvm/hyp_trace.c index cb63af69c38d..1e887e430c42 100644 --- a/arch/arm64/kvm/hyp_trace.c +++ b/arch/arm64/kvm/hyp_trace.c @@ -892,6 +892,35 @@ static int hyp_trace_clock_show(struct seq_file *m, void *v) } DEFINE_SHOW_ATTRIBUTE(hyp_trace_clock); +#ifdef CONFIG_PKVM_SELFTESTS +static int selftest_event_open(struct inode *inode, struct file *file) +{ + if (file->f_mode & FMODE_WRITE) + return kvm_call_hyp_nvhe(__pkvm_selftest_event); + + return 0; +} + +static ssize_t selftest_event_write(struct file *f, const char __user *buf, + size_t cnt, loff_t *pos) +{ + return cnt; +} + +static const struct file_operations selftest_event_fops = { + .open = selftest_event_open, + .write = selftest_event_write, +}; + +static void 
hyp_trace_init_testing_tracefs(struct dentry *root) +{ + tracefs_create_file("selftest_event", TRACEFS_MODE_WRITE, root, NULL, + &selftest_event_fops); +} +#else +static void hyp_trace_init_testing_tracefs(struct dentry *root) { } +#endif + int hyp_trace_init_tracefs(void) { struct dentry *root, *per_cpu_root; @@ -950,6 +979,7 @@ int hyp_trace_init_tracefs(void) } hyp_trace_init_event_tracefs(root); + hyp_trace_init_testing_tracefs(root); return 0; } diff --git a/tools/testing/selftests/hyp-trace/Makefile b/tools/testing/selftests/hyp-trace/Makefile new file mode 100644 index 000000000000..2a5b2e29667e --- /dev/null +++ b/tools/testing/selftests/hyp-trace/Makefile @@ -0,0 +1,6 @@ +# SPDX-License-Identifier: GPL-2.0 +all: + +TEST_PROGS := hyp-trace-test + +include ../lib.mk diff --git a/tools/testing/selftests/hyp-trace/config b/tools/testing/selftests/hyp-trace/config new file mode 100644 index 000000000000..135657ef550d --- /dev/null +++ b/tools/testing/selftests/hyp-trace/config @@ -0,0 +1,4 @@ +CONFIG_FTRACE=y +CONFIG_ARM64=y +CONFIG_KVM=y +CONFIG_PKVM_SELFTESTS=y diff --git a/tools/testing/selftests/hyp-trace/hyp-trace-test b/tools/testing/selftests/hyp-trace/hyp-trace-test new file mode 100755 index 000000000000..9be0c5f57160 --- /dev/null +++ b/tools/testing/selftests/hyp-trace/hyp-trace-test @@ -0,0 +1,264 @@ +#!/bin/sh -e +# SPDX-License-Identifier: GPL-2.0-only + +# hyp-trace-test - Tracefs for pKVM hypervisor test +# +# Copyright (C) 2024 - Google LLC +# Author: Vincent Donnefort +# + +log_and_die() +{ + echo "$1" + + exit 1 +} + +host_clock() +{ + # BOOTTIME clock + awk '/now/ { printf "%.6f\n", $3 / 1000000000 }' /proc/timer_list +} + +page_size() +{ + echo "$(awk '/KernelPageSize/ {print $2; exit}' /proc/self/smaps) * 1024" | bc +} + +goto_hyp_trace() +{ + if [ -d "/sys/kernel/debug/tracing/hypervisor" ]; then + cd /sys/kernel/debug/tracing/hypervisor + return + fi + + if [ -d "/sys/kernel/tracing/hypervisor" ]; then + cd /sys/kernel/tracing/hypervisor + return + fi + + echo "ERROR: hyp tracing folder not found!" + + exit 1 +} + +reset_hyp_trace() +{ + echo 0 > tracing_on + echo 0 > trace + for event in events/hypervisor/*; do + echo 0 > $event/enable + done + + assert_unloaded +} + +setup_hyp_trace() +{ + reset_hyp_trace + + echo 16 > buffer_size_kb + echo 1 > events/hypervisor/selftest/enable + echo 1 > tracing_on +} + +stop_hyp_trace() +{ + echo 0 > tracing_on +} + +hyp_trace_loaded() +{ + grep -q "(loaded)" buffer_size_kb +} + +write_events() +{ + local num="$1" + local func="$2" + + for i in $(seq 1 $num); do + echo 1 > selftest_event + [ -z "$func" -o $i -eq $num ] || eval $func + done +} + +consuming_read() +{ + local output=$1 + + cat trace_pipe > $output & + + echo $! 
+} + +run_test_consuming() +{ + local nr_events=$1 + local func=$2 + local tmp="$(mktemp)" + local start_ts=0 + local end_ts=0 + local pid=0 + + echo "Output trace file: $tmp" + + setup_hyp_trace + pid=$(consuming_read $tmp) + + start_ts=$(host_clock) + write_events $nr_events $func + stop_hyp_trace + end_ts=$(host_clock) + + kill $pid + validate_test $tmp $nr_events $start_ts $end_ts + + rm $tmp +} + +validate_test() +{ + local output=$1 + local expected_events=$2 + local start_ts=$3 + local end_ts=$4 + local prev_ts=$3 + local ts=0 + local num_events=0 + + IFS=$'\n' + for line in $(cat $output); do + echo "$line" | grep -q -E "^# " && continue + ts=$(echo "$line" | awk '{print $2}' | cut -d ':' -f1) + if [ $(echo "$ts<$prev_ts" | bc) -eq 1 ]; then + log_and_die "Error event @$ts < $prev_ts" + fi + prev_ts=$ts + num_events=$((num_events + 1)) + done + + if [ $(echo "$ts>$end_ts" | bc) -eq 1 ]; then + log_and_die "Error event @$ts > $end_ts" + fi + + if [ $num_events -ne $expected_events ]; then + log_and_die "Expected $expected_events events, got $num_events" + fi +} + +test_ts() +{ + echo "Test Timestamps..." + + run_test_consuming 1000 + + echo "done." +} + +test_extended_ts() +{ + echo "Test Extended Timestamps..." + + run_test_consuming 1000 "sleep 0.1" + + echo "done." +} + +assert_loaded() +{ + hyp_trace_loaded || log_and_die "Expected loaded buffer" +} + +assert_unloaded() +{ + ! hyp_trace_loaded || log_and_die "Expected unloaded buffer" +} + +test_unloading() +{ + local tmp="$(mktemp)" + + echo "Test unloading..." + + setup_hyp_trace + assert_loaded + + echo 0 > tracing_on + assert_unloaded + + pid=$(consuming_read $tmp) + sleep 1 + assert_loaded + kill $pid + assert_unloaded + + echo 1 > tracing_on + write_events 1 + echo 0 > trace + assert_loaded + echo 0 > tracing_on + assert_unloaded + + echo "done." +} + +test_reset() +{ + local tmp="$(mktemp)" + + echo "Test Reset..." + + setup_hyp_trace + write_events 1000 + echo 0 > trace + clock_before=$(host_clock) + write_events 5 + + pid=$(consuming_read $tmp) + sleep 1 + stop_hyp_trace + kill $pid + + validate_test $tmp 5 $clock_before $(host_clock) + + rm $tmp + + echo "done." +} + +test_big_bpacking() +{ + local hyp_buffer_page_size=40 + local page_size=$(page_size) + local min_buf_size + + # Number of ring-buffer pages stored in a single bpacking page + min_buf_size=$(echo "$page_size / ($hyp_buffer_page_size * $(nproc))" | bc) + # Size of the ring-buffer to fill a single bpacking page + min_buf_size=$(echo "$page_size * $min_buf_size" | bc) + # Size in kiB of the ring-buffer to fill an order-1 bpacking page + min_buf_size=$(echo "$min_buf_size * 2 / 1024" | bc) + + echo "Test loading $min_buf_size kB buffer..." + + reset_hyp_trace + + echo $min_buf_size > buffer_size_kb + echo 1 > tracing_on + assert_loaded + + stop_hyp_trace + + echo "done." +} + +goto_hyp_trace + +test_reset +test_unloading +test_big_bpacking +test_ts +test_extended_ts + +exit 0
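
[Editor's note: a sketch, not from the series, of what each
"echo 1 > selftest_event" in write_events() above amounts to. Per
selftest_event_open(), the write-mode open itself issues the
__pkvm_selftest_event hypercall; the paths assume the tracefs root the
script resolves in goto_hyp_trace().

#include <fcntl.h>
#include <unistd.h>

#define HYP_TRACEFS "/sys/kernel/tracing/hypervisor"

int main(void)
{
	int fd;

	/* echo 1 > events/hypervisor/selftest/enable */
	fd = open(HYP_TRACEFS "/events/hypervisor/selftest/enable", O_WRONLY);
	if (fd < 0)
		return 1;
	if (write(fd, "1", 1) != 1)
		return 1;
	close(fd);

	/* This open fires one "selftest" event at EL2; the write is a no-op. */
	fd = open(HYP_TRACEFS "/selftest_event", O_WRONLY);
	if (fd < 0)
		return 1;
	close(fd);

	return 0;
}
]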