[RFC,v2,0/3] cxl: Multi-headed Single Logical Device (MHSLD)

Message ID 20240424-cxl-2024-04-22-pullahead-v2-0-a87fc4597795@memverge.com

Message

Svetly Todorov April 24, 2024, 8:04 p.m. UTC
MHSLDs allow multiple hosts to access dynamic capacity on a single
backing device. This complicates DC management because adds,
removals, and accesses need to be vetted such that hosts don't
stomp on each other's data.

This patchset proposes a set of hooks to be called in cxl_type3.c
when each of the above events happens. The results of the hooks
can be used to prevent illegal DC operations in the corresponding
cxl_* functions. 
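
As a rough illustration of the hook idea above, the sketch below shows
one way a type 3 code path could consult an optional callback and treat
its result as a veto. Every name here (MhdHooksSketch, dc_access_allowed,
range_access_valid, struct owned_range) is hypothetical and chosen for
illustration; none of it is taken from the patches.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical hook table: one optional callback per DC event. */
typedef struct MhdHooksSketch {
    bool (*reserve_extent)(void *dev, uint64_t dpa, uint64_t len); /* add */
    bool (*release_extent)(void *dev, uint64_t dpa, uint64_t len); /* removal */
    bool (*access_valid)(void *dev, uint64_t dpa, uint64_t len);   /* access */
} MhdHooksSketch;

/*
 * Caller side: a device with no hooks behaves as before; a device
 * with a hook lets the hook veto the operation.
 */
static bool dc_access_allowed(const MhdHooksSketch *hooks, void *dev,
                              uint64_t dpa, uint64_t len)
{
    assert(len > 0);
    if (!hooks || !hooks->access_valid) {
        return true;
    }
    return hooks->access_valid(dev, dpa, len);
}

/* Example hook (assumption): dev describes the one range this host owns. */
struct owned_range {
    uint64_t base;
    uint64_t len;
};

static bool range_access_valid(void *dev, uint64_t dpa, uint64_t len)
{
    const struct owned_range *r = dev;

    /* Written to avoid overflow in dpa + len. */
    return dpa >= r->base && len <= r->len && dpa - r->base <= r->len - len;
}
```

Leaving the callbacks optional keeps the single-headed device unchanged,
since a missing hook simply means "allowed".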

It also proposes an implementation for MHSLD emulation. The device's
state is stored in a shared memory-backed file (on Linux, under /dev/shm).
Extent ownership is tracked in a bytemap, wherein every byte acts
as a set of flags showing which host ID can access the extent at that
index. Blocks in an extent are claimed via compare-and-swap. If a host
cannot claim every block in an extent, the whole claim fails; the host
then walks back over the blocks it has claimed so far and clears its
ownership bit on each. All operations on block state are atomic, which
allows claims on unrelated extents to be serviced concurrently.
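
The claim-and-roll-back scheme described above can be sketched with C11
atomics roughly as follows. This is a minimal sketch, not the code in
mhsld.c: the function names are hypothetical, and it assumes a block is
claimable only while its byte is 0 (no owner at all).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MHSLD_SKETCH_MAX_HEADS 8 /* one ownership bit per head in a u8 */

/* Clear this head's ownership bit on blocks [start, end). */
static void mhsld_release_blocks(_Atomic uint8_t *map, size_t start,
                                 size_t end, uint8_t head)
{
    assert(head < MHSLD_SKETCH_MAX_HEADS);
    uint8_t mask = (uint8_t)(1u << head);

    for (size_t i = start; i < end; i++) {
        atomic_fetch_and(&map[i], (uint8_t)~mask);
    }
}

/*
 * Try to claim blocks [start, start + len) for one head.
 * Assumption: a block can be claimed only if nobody owns it (byte == 0).
 * On the first conflict, undo the blocks claimed so far and fail.
 */
static bool mhsld_claim_blocks(_Atomic uint8_t *map, size_t start,
                               size_t len, uint8_t head)
{
    assert(head < MHSLD_SKETCH_MAX_HEADS);
    uint8_t mask = (uint8_t)(1u << head);

    for (size_t i = start; i < start + len; i++) {
        uint8_t expected = 0;

        if (!atomic_compare_exchange_strong(&map[i], &expected, mask)) {
            mhsld_release_blocks(map, start, i, head);
            return false;
        }
    }
    return true;
}
```

Because each block is updated with its own atomic operation, two hosts
claiming disjoint extents never wait on each other; only overlapping
claims can conflict, and the loser rolls back.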

Block state is limited to a u8 to conserve space, so only 8 heads
are currently supported.

The MHSLD device inherits from the CXL_TYPE3 class and adds the following
configuration options:
--mhd-head=<u32>
--mhd-state_file=<str>
--mhd-init=<bool>

--mhd-head specifies the head ID of the host on the given device.

--mhd-state_file is the name of the shared-memory-backed file used
to store the MHD state.

--mhd-init indicates whether this QEMU instance should initialize
the state_file; if so, the instance will create the file if it does
not exist, ftruncate it to the appropriate size, and initialize its
header. It is assumed that the --mhd-init instance is run and allowed
to completely finish configuration before any other guests access the
shared state.
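
The initialization sequence above (create, ftruncate, write the header)
might look roughly like the sketch below. The header layout, the magic
value, and the function name are assumptions made for illustration, and
the file is opened by full path with open(2) rather than by whatever
mechanism the patch actually uses.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdbool.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

#define MHSLD_SKETCH_MAGIC 0x4d485344u /* arbitrary marker, hypothetical */

/* Hypothetical shared-state layout: small header, then one byte per block. */
struct mhsld_state_sketch {
    uint32_t magic;
    uint32_t nr_heads;
    uint64_t nr_blocks;
    uint8_t blocks[];
};

/*
 * Map the shared state file. With init == true the file is created if
 * missing, sized with ftruncate(), and its header is written. With
 * init == false the file must already exist and be initialized.
 */
static struct mhsld_state_sketch *
mhsld_map_state(const char *path, uint64_t nr_blocks, bool init)
{
    assert(nr_blocks > 0);
    size_t size = sizeof(struct mhsld_state_sketch) + nr_blocks;
    int fd = open(path, init ? (O_RDWR | O_CREAT) : O_RDWR, 0600);

    if (fd < 0) {
        return NULL;
    }
    if (init && ftruncate(fd, (off_t)size) != 0) {
        close(fd);
        return NULL;
    }

    void *p = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd); /* the mapping stays valid after close */
    if (p == MAP_FAILED) {
        return NULL;
    }

    struct mhsld_state_sketch *s = p;
    if (init) {
        s->magic = MHSLD_SKETCH_MAGIC;
        s->nr_heads = 8;
        s->nr_blocks = nr_blocks;
    }
    return s;
}
```

Nothing in this sketch synchronizes the initializer with other
instances, which is why the --mhd-init guest has to finish before any
other guest maps the file.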

The shared state file only needs to be initialized once. Even if a guest
dies without clearing the ownership bits associated with its head-ID,
future guests with that ID will clear those bits in cxl_mhsld_realize(),
regardless of whether mhd_init is true or false.
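
The cleanup step described above amounts to masking one head's bit out
of every byte in the bytemap. A minimal sketch follows; the function
name and the returned count are illustrative assumptions, not the body
of cxl_mhsld_realize().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Drop every ownership bit held by `head`, e.g. bits left behind by a
 * guest that died. Other heads' bits are untouched. Returns how many
 * blocks actually had the bit set.
 */
static size_t mhsld_clear_head(_Atomic uint8_t *map, size_t nr_blocks,
                               uint8_t head)
{
    assert(head < 8); /* a u8 of flags holds at most 8 heads */
    uint8_t mask = (uint8_t)(1u << head);
    size_t cleared = 0;

    for (size_t i = 0; i < nr_blocks; i++) {
        uint8_t old = atomic_fetch_and(&map[i], (uint8_t)~mask);

        if (old & mask) {
            cleared++;
        }
    }
    return cleared;
}
```

Since each byte is updated atomically, this pass can run while other
heads continue to claim and release their own blocks.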

The following command line options create an MHSLD with 4GB of
backing memory, whose state is tracked in /dev/shm/mhd_metadata.
--mhd-init=true tells this instance to initialize the file as
described above.

./qemu-system-x86_64 \
[... other options ...] \
-device pxb-cxl,id=cxl.0,bus=pcie.0,bus_nr=52 \
-device cxl-rp,id=rp0,bus=cxl.0,chassis=0,port=0,slot=0 \
-object memory-backend-ram,id=mem0,size=4G \
-device cxl-mhsld,bus=rp0,num-dc-regions=1,volatile-dc-memdev=mem0,id=cxl-mem0,sn=66667,mhd-head=0,mhd-state_file=mhd_metadata,mhd-init=true \
-M cxl-fmw.0.targets.0=cxl.0,cxl-fmw.0.size=4G \
-qmp unix:/tmp/qmp-sock-1,server,nowait

Once this guest completes setup, other guests looking to access the
device can be booted with the same configuration options, but with
--mhd-head != 0,
--mhd-init=false,
and a different QMP socket.

The first patch in the series moves the CXLUpdateDCExtentListInPl
struct to a shared header. This allows DC-management code outside
of cxl_type3.c (like mhsld.c) to see which extents have been
accepted by a host in its DC-add response.

Signed-off-by: Svetly Todorov <svetly.todorov@memverge.com>
Signed-off-by: Gregory Price <gregory.price@memverge.com>

Changes in v2:
- remove explicit init_mhsld script
- implement remaining DC add/remove logic
- link to v1: https://lore.kernel.org/qemu-devel/20240308185211.277116-1-gregory.price@memverge.com/

---
Svetly Todorov (3):
      cxl-mailbox-utils: move CXLUpdateDCExtentListInPl into header
      cxl_type3: add MHD callbacks
      mhsld: implement MHSLD device

 hw/cxl/Kconfig              |   1 +
 hw/cxl/cxl-mailbox-utils.c  |  44 +++--
 hw/cxl/meson.build          |   1 +
 hw/cxl/mhsld/Kconfig        |   4 +
 hw/cxl/mhsld/meson.build    |   3 +
 hw/cxl/mhsld/mhsld.c        | 456 ++++++++++++++++++++++++++++++++++++++++++++
 hw/cxl/mhsld/mhsld.h        |  75 ++++++++
 hw/mem/cxl_type3.c          |  17 ++
 include/hw/cxl/cxl_device.h |  24 +++
 9 files changed, 608 insertions(+), 17 deletions(-)
---
base-commit: 532d3bf76f73b8edafd3333dacae449b7c9bb61d
change-id: 20240422-cxl-2024-04-22-pullahead-fd6fd9e6d1e2

Best regards,

Comments

Jonathan Cameron June 21, 2024, 5:28 p.m. UTC | #1
On Wed, 24 Apr 2024 13:04:05 -0700
Svetly Todorov <svetly.todorov@memverge.com> wrote:

> MHSLDs allow multiple hosts to access dynamic capacity on a single
> backing device. This complicates DC management because adds,
> removals, and accesses need to be vetted such that hosts don't
> stomp on each other's data.
> 
> This patchset proposes a set of hooks to be called in cxl_type3.c
> when each of the above events happens. The results of the hooks
> can be used to prevent illegal DC operations in the corresponding
> cxl_* functions. 

Hi,

I had a quick go at applying this to my gitlab cxl staging tree
to make it more generally available.  It unfortunately needs some updates
for the change to Extent Groups in the most recent DCD code.

Whilst I can probably sort that, it's not going to happen particularly soon.

I'll be pushing a new tree shortly if you have time to rebase.

Jonathan