[v6,0/4] hw/cxl: Poison get, inject, clear

Message ID 20230519141803.29713-1-Jonathan.Cameron@huawei.com

Jonathan Cameron May 19, 2023, 2:17 p.m. UTC
Note that Michael Tsirkin replied to v5 to say he has the first
3 patches queued. The bswap one is buggy, so it should be dropped and
replaced with this v6. Changes in that patch were large
enough that I've dropped Fan Ni's tag.

Tested the problematic cross compiles using docker.

v6:
 - Fix 24 bit bswap on big endian platforms.
   I think this was broken in a bad rebase some time ago.
   Fix is to not use the general macros for this case as they
   rely on a builtin 24 bit byte swap which doesn't exist.
 - Fix wrong use of int128_to_64() on a variable that was uint64_t in
   the first place.
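
To illustrate the 24 bit fix above: one endian-independent way to store to an
unaligned 24 bit little-endian field is to write it a byte at a time, so no
builtin 24 bit byte swap is needed. This is only a minimal sketch of the idea,
not the code from the patch; store_le24() and load_le24() are hypothetical
names used here for illustration.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not the QEMU API): store the low 24 bits of v
 * in little-endian order at ptr. Byte-wise stores make this safe for
 * unaligned pointers and give the same layout on big and little
 * endian hosts.
 */
static inline void store_le24(void *ptr, uint32_t v)
{
    uint8_t *p = ptr;

    p[0] = v & 0xff;
    p[1] = (v >> 8) & 0xff;
    p[2] = (v >> 16) & 0xff;
}

/* Hypothetical counterpart: load a 24 bit little-endian value. */
static inline uint32_t load_le24(const void *ptr)
{
    const uint8_t *p = ptr;

    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) | ((uint32_t)p[2] << 16);
}
```

The top byte of v is silently discarded, and bytes outside the 3 byte field
are never touched.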

Many of the precursors listed for v4 have now been applied, but
a few minor fixes have come up in the meantime, so a few precursors
remain, including the volatile memory support carried over from the
v4 list.

Michael has the above series + initial patches of v5 of this series
queued up.

Depends on:
[PATCH 0/2] hw/cxl: CDAT file handling fixes.
[PATCH v2 0/3] hw/cxl: Fix decoder commit and uncommit handling
[PATCH 0/3] docs/cxl: Gathering of fixes for 8.0 CXL docs.
[PATCH v5 0/3] hw/mem: CXL Type-3 Volatile Memory Support
 
Based on: Message-ID: 20230421132020.7408-1-Jonathan.Cameron@huawei.com
Based on: Message-ID: 20230421135906.3515-1-Jonathan.Cameron@huawei.com
Based on: Message-ID: 20230421134507.26842-1-Jonathan.Cameron@huawei.com
Based on: Message-ID: 20230421160827.2227-1-Jonathan.Cameron@huawei.com

The kernel support for Poison handling is currently in the cxl/pending
branch and hopefully should be in the CXL pull request next week.

https://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl.git/log/?h=pending

This code has been very useful for testing and helped identify various
corner cases.

Updated cover letter.

The series supports:
1) Injection of variable length poison regions via QMP (to fake real
   memory corruption and ensure we deal with odd overflow corner cases,
   such as clearing the middle of a large region making the list overflow
   as we go from one long entry to two smaller entries).
2) Read of poison list via the CXL mailbox.
3) Injection via the poison injection mailbox command (limited to 64 byte
   entries - spec constraint)
4) Clearing of poison injected via either method.
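
The overflow corner case mentioned in 1) can be sketched as follows. This is
a simplified model under assumptions, not the code from this series: the
fixed-size array, the POISON_LIST_MAX limit, and the names poison_append()
and poison_clear() are all hypothetical. It shows how clearing the middle of
one entry yields two entries, which can push a full list into overflow.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define POISON_LIST_MAX 4 /* hypothetical limit, chosen small for the demo */

struct poison_range {
    uint64_t start;
    uint64_t length;
};

struct poison_list {
    struct poison_range ent[POISON_LIST_MAX];
    size_t count;
    bool overflow; /* set when an entry had to be dropped */
};

/* Append a range, or record overflow if the list is already full. */
static void poison_append(struct poison_list *l, uint64_t start,
                          uint64_t length)
{
    if (length == 0) {
        return;
    }
    if (l->count == POISON_LIST_MAX) {
        l->overflow = true;
        return;
    }
    l->ent[l->count].start = start;
    l->ent[l->count].length = length;
    l->count++;
}

/* Clear [start, start + length), splitting any entry it cuts through. */
static void poison_clear(struct poison_list *l, uint64_t start,
                         uint64_t length)
{
    struct poison_list out = { .count = 0, .overflow = l->overflow };
    uint64_t end = start + length;

    for (size_t i = 0; i < l->count; i++) {
        uint64_t s = l->ent[i].start;
        uint64_t e = s + l->ent[i].length;

        if (e <= start || s >= end) {
            /* No overlap with the cleared range: keep the entry as is. */
            poison_append(&out, s, e - s);
            continue;
        }
        if (s < start) {
            /* Part before the cleared range survives. */
            poison_append(&out, s, start - s);
        }
        if (e > end) {
            /* Part after the cleared range survives. */
            poison_append(&out, end, e - end);
        }
    }
    *l = out;
}
```

With a single 4096 byte entry at 0, clearing 64 bytes at offset 1024 leaves
two entries; doing the same on an already full list sets the overflow flag.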

The implementation is meant to be a valid combination of impdef choices
based on what the spec allowed. There are a number of places where it could
be made more sophisticated that we might consider in future:
* Fusing adjacent poison entries if the types match.
* Separate injection list and main poison list, to test out limits on
  injected poison list being smaller than the main list.
* Poison list overflow event (needs event log support in general)
* Connecting up to the poison list error record generation (rather complex
  and not needed for current kernel handling testing).
* Triggering the synchronous and asynchronous errors that occur on reads
  and writes of the memory when the host receives poison.

As the kernel code is currently fairly simple, it is likely that the above
does not yet matter but who knows what will turn up in future!


Ira Weiny (1):
  bswap: Add the ability to store to an unaligned 24 bit field

Jonathan Cameron (3):
  hw/cxl: QMP based poison injection support
  hw/cxl: Add poison injection via the mailbox.
  hw/cxl: Add clear poison mailbox command support.

 docs/devel/loads-stores.rst |   1 +
 hw/cxl/cxl-mailbox-utils.c  | 214 ++++++++++++++++++++++++++++++++++++
 hw/mem/cxl_type3.c          |  93 ++++++++++++++++
 hw/mem/cxl_type3_stubs.c    |   6 +
 include/hw/cxl/cxl.h        |   1 +
 include/hw/cxl/cxl_device.h |  21 ++++
 include/qemu/bswap.h        |  27 +++++
 qapi/cxl.json               |  18 +++
 8 files changed, 381 insertions(+)