[V3,1/8] cxl/mem: Read, trace, and clear events on driver load

From: Ira Weiny <ira.weiny@intel.com>

CXL devices have multiple event logs which can be queried for CXL event
records.  Devices are required to support the storage of at least one
event record in each event log type.

Devices track event log overflow by incrementing a counter and tracking
the time of the first and last overflow event seen.
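
As an illustration of the overflow tracking described above, the following is
a minimal userspace sketch of the header that precedes the event records in a
Get Event Records response.  The struct name, field names, and flag macro are
hypothetical; the offsets reflect one reading of CXL rev 3.0 section 8.2.9.2.2
and should be checked against the specification.  Multi-byte fields are
little-endian on the wire, which this sketch does not convert.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed flag bit: set when the device dropped records due to overflow. */
#define EVENT_FLAG_OVERFLOW 0x01u

/* Hypothetical mirror of the Get Event Records output header (32 bytes). */
struct event_log_header {
	uint8_t  flags;               /* overflow / more-records indicators */
	uint8_t  reserved1;
	uint16_t overflow_err_count;  /* incremented on each overflow */
	uint64_t first_overflow_ts;   /* time of first overflow event */
	uint64_t last_overflow_ts;    /* time of last overflow event */
	uint16_t record_count;        /* records that follow this header */
	uint8_t  reserved2[10];
} __attribute__((packed));

static_assert(sizeof(struct event_log_header) == 32,
	      "header is expected to be 0x20 bytes");

/* Return non-zero if the header reports that the log overflowed. */
static int log_overflowed(const struct event_log_header *hdr)
{
	return (hdr->flags & EVENT_FLAG_OVERFLOW) != 0;
}
```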

Software queries events via the Get Event Records mailbox command (CXL
rev 3.0 section 8.2.9.2.2) and clears them via the Clear Event Records
mailbox command (CXL rev 3.0 section 8.2.9.2.3).

CXL _OSC Error Reporting Control is used by the OS to determine if
Firmware has control of various error reporting capabilities including
the event logs.

Expose the result of negotiating CXL Error Reporting Control in struct
pci_host_bridge for consumption by the CXL drivers.  If support is
controlled by the OS, read and clear all event logs on driver load.

Ensure a clean slate of events by reading and clearing the events on
driver load.  The operation is performed twice to ensure that any prior
partial readings are completed and a fresh read from the start is done
at least one time.  This is done even if rogue reads cause clear errors.
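
The two-pass drain described above can be sketched in userspace against a
mock device.  This is not the driver code: the enum, the mock_device struct,
the batch size, and all function names are hypothetical, and the real mailbox
commands are replaced by a simple counter.  The logs are visited from fatal to
informational, matching the ordering noted in the changelog below.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical log identifiers, lowest to highest severity. */
enum event_log { LOG_INFO, LOG_WARN, LOG_FAIL, LOG_FATAL, LOG_MAX };

/* Mock device: only tracks how many records are pending in each log. */
struct mock_device {
	unsigned int pending[LOG_MAX];
};

/* Assumed number of records returned per query in this mock. */
#define BATCH 3u

/* Stand-in for one Get + Clear round trip; returns records handled. */
static unsigned int drain_batch(struct mock_device *dev, enum event_log log)
{
	unsigned int n = dev->pending[log] < BATCH ? dev->pending[log] : BATCH;

	dev->pending[log] -= n;
	return n;
}

/* Query one log until the device reports no more records. */
static unsigned int drain_log(struct mock_device *dev, enum event_log log)
{
	unsigned int total = 0, n;

	while ((n = drain_batch(dev, log)) != 0)
		total += n;
	return total;
}

/* Visit every log twice, most severe first, and count records handled. */
static unsigned int drain_all_logs_twice(struct mock_device *dev)
{
	unsigned int total = 0;

	for (int pass = 0; pass < 2; pass++)
		for (int log = LOG_FATAL; log >= LOG_INFO; log--)
			total += drain_log(dev, (enum event_log)log);
	return total;
}
```

In this mock the second pass finds nothing to do; on real hardware it covers
the case where a prior partial read left the device mid-log.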

The status register is not used because a device may continue to trigger
events and the only requirement is to empty the log at least once.  This
allows for the required transition from empty to non-empty for interrupt
generation.  Handling of interrupts is in a follow-on patch.

The device can return up to 1MB worth of event records per query.
Allocate a shared large buffer to handle the max number of records based
on the mailbox payload size.
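
The relationship between the mailbox payload size and the number of records
per query can be sketched as below.  The helper name and the two size macros
are hypothetical; the 32-byte header and 128-byte record sizes are assumptions
taken from one reading of CXL rev 3.0 and are not stated in this patch
description.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed sizes of the response header and of one event record. */
#define EVENT_HEADER_SIZE 32u
#define EVENT_RECORD_SIZE 128u

/* Whole records that fit in one response of the given payload size. */
static size_t max_event_records(size_t payload_size)
{
	if (payload_size < EVENT_HEADER_SIZE)
		return 0;
	return (payload_size - EVENT_HEADER_SIZE) / EVENT_RECORD_SIZE;
}
```

Under these assumptions a 1MB payload holds 8191 records per query, and a
256-byte payload holds only one.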

This patch traces a raw event record and leaves specific event record
type tracing to subsequent patches.  Macros are created to aid in
tracing the common CXL Event header fields.

Each record is cleared explicitly.  A clear all bit is specified but is
only valid when the log overflows.
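
Because each record is cleared by handle, the number of Clear Event Records
commands depends on how many handles fit in one command.  The sketch below is
illustrative only: the function and macro names are hypothetical, the 255
limit follows from the u8 handle count mentioned in the V1 changelog, and the
6-byte header with 2-byte handles is an assumption based on one reading of
CXL rev 3.0 section 8.2.9.2.3.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed fixed part of the Clear Event Records input payload. */
#define CLEAR_HEADER_SIZE 6u
/* The handle count is a u8, so one command carries at most 255 handles. */
#define CLEAR_MAX_HANDLES 255u

/* Handles per Clear command, limited by both the u8 count and the payload. */
static size_t max_clear_handles(size_t payload_size)
{
	size_t by_payload;

	if (payload_size <= CLEAR_HEADER_SIZE)
		return 0;
	by_payload = (payload_size - CLEAR_HEADER_SIZE) / sizeof(uint16_t);
	return by_payload < CLEAR_MAX_HANDLES ? by_payload : CLEAR_MAX_HANDLES;
}

/* Clear commands needed to clear nr_handles records (rounded up). */
static size_t clear_commands_needed(size_t nr_handles, size_t payload_size)
{
	size_t per_cmd = max_clear_handles(payload_size);

	if (per_cmd == 0)
		return 0;
	return (nr_handles + per_cmd - 1) / per_cmd;
}
```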

Signed-off-by: Ira Weiny <ira.weiny@intel.com>

---
Changes from V2:
	Rebased on 6.3 pending changes
	Move cxl_mem_alloc_event_buf() to pci.c
	Define and use CXLDEV_EVENT_STATUS_ALL
	Fix error flow on clear failure
	Remove tags
	Jonathan/Dan
		Add in OSC Error Reporting Control check
	Dan (Jonathan in previous version)
		Squash Clear events and the driver load patch into this one.
	Dan
		Make event device status a separate structure
		Move tracing to within cxl core
		Reduce clear double loop to a single loop
		Pass struct device to trace points
		Adjust to new cxl_internal_send_cmd()
		Query error logs in order of severity fatal -> Info
		Remove uapi defines entirely
		pass total via get_pl
		fix 'Clearning' spelling
		Better clarify event_buf singular allocation
		Use decimal for command payload array sizes
		Remove trace_*_enabled() optimization
		Put GET/CLEAR macros at the end of the user enum to
		preserve compatibility
		Add Get/Clear Events to kernel exclusive commands
		Remove cxl_event_log_type_str() outside of tracing
		Add cond_resched() to event log processing
	Jonathan
		s/event_buf_lock/event_log_lock
		Read through all logs two times to ensure partial reads are
			covered.
		Pass buffer to cxl_mem_free_event_buffer()
		kdoc for event buf
		Account for cxlds->payload_size limiting the max handles
		Don't use min_t as it was used incorrectly

Changes from V1:
	Clear Event Record allows for u8 handles while Get Event Record
	allows for u16 records to be returned.  Based on Jonathan's
	feedback; allow for all event records to be handled in this
	clear.  Which means a double loop with potentially multiple
	Clear Event payloads being sent to clear all events sent.

Changes from RFC:
	Jonathan
		Clean up init of payload and use return code.
		Also report any error to clear the event.
		s/v3.0/rev 3.0

squash: make event device state a separate structure.
---
 drivers/acpi/pci_root.c  |   3 +
 drivers/cxl/core/mbox.c  | 138 +++++++++++++++++++++++++++++++++++++++
 drivers/cxl/core/trace.h | 120 ++++++++++++++++++++++++++++++++++
 drivers/cxl/cxl.h        |  12 ++++
 drivers/cxl/cxlmem.h     |  84 ++++++++++++++++++++++++
 drivers/cxl/pci.c        |  42 ++++++++++++
 include/linux/pci.h      |   1 +
 7 files changed, 400 insertions(+)

Message ID	20221208052115.800170-2-ira.weiny@intel.com
State	Superseded
Headers	From: ira.weiny@intel.com
	To: Dan Williams <dan.j.williams@intel.com>
	Cc: Ira Weiny <ira.weiny@intel.com>, Bjorn Helgaas <bhelgaas@google.com>, Alison Schofield <alison.schofield@intel.com>, Vishal Verma <vishal.l.verma@intel.com>, Davidlohr Bueso <dave@stgolabs.net>, Jonathan Cameron <Jonathan.Cameron@huawei.com>, Dave Jiang <dave.jiang@intel.com>, linux-kernel@vger.kernel.org, linux-pci@vger.kernel.org, linux-acpi@vger.kernel.org, linux-cxl@vger.kernel.org
	Subject: [PATCH V3 1/8] cxl/mem: Read, trace, and clear events on driver load
	Date: Wed, 7 Dec 2022 21:21:07 -0800
	Message-Id: <20221208052115.800170-2-ira.weiny@intel.com>
	X-Mailer: git-send-email 2.37.2
	In-Reply-To: <20221208052115.800170-1-ira.weiny@intel.com>
	References: <20221208052115.800170-1-ira.weiny@intel.com>
	List-ID: <linux-cxl.vger.kernel.org>
	X-Mailing-List: linux-cxl@vger.kernel.org
Series	CXL: Process event logs
	[V3,0/8] CXL: Process event logs
	[V3,1/8] cxl/mem: Read, trace, and clear events on driver load
	[V3,2/8] cxl/mem: Wire up event interrupts
	[V3,3/8] cxl/mem: Trace General Media Event Record
	[V3,4/8] cxl/mem: Trace DRAM Event Record
	[V3,5/8] cxl/mem: Trace Memory Module Event Record
	[V3,6/8] cxl/test: Add generic mock events
	[V3,7/8] cxl/test: Add specific events
	[V3,8/8] cxl/test: Simulate event log overflow
