[0/1] hw/nvme: add atomic write support

Message ID 20240820161123.316887-1-alan.adamson@oracle.com (mailing list archive)

Message

Alan Adamson Aug. 20, 2024, 4:11 p.m. UTC
Since there is work in the Linux NVMe Driver community to add Atomic Write
support, it would be desirable to be able to test it with qemu nvme emulation.
 
This patch will focus on supporting NVMe controller atomic write parameters (AWUN and
AWUPF) but can be extended to support Namespace parameters (NAWUN and NAWUPF)
and Boundaries (NABSN, NABO, and NABSPF).
 
Atomic Write Parameters for NVMe QEMU
-------------------------------------
New NVMe QEMU Parameters (See NVMe Specification for details):
        atomic.dn (default off) - Set the value of Disable Normal.
        atomic.awun=UINT16 (default: 0)
        atomic.awupf=UINT16 (default: 0)
 
qemu command line example:
        qemu-system-x86_64 -cpu host --enable-kvm -smp cpus=4 -no-reboot -m 8192M -drive file=./disk.img,if=ide \
        -boot c -device e1000,netdev=net0,mac=DE:CC:CC:EF:99:88 -netdev tap,id=net0 \
        -device nvme,id=nvme-ctrl-0,serial=nvme-1,atomic.dn=off,atomic.awun=15,atomic.awupf=7 \
        -drive file=./nvme.img,if=none,id=nvm-1 -device nvme-ns,drive=nvm-1,bus=nvme-ctrl-0
 
Making Writes Atomic:
---------------------
Currently, as the nvme emulator walks through the Submission Queue (SQ)
(nvme_process_sq()), it takes each request (read/write/etc.) off the SQ, starts its
execution, and then continues with the next SQ entry until all entries are started. It
is likely that multiple requests (from multiple SQs) will be executing in parallel and
acting on a common LBA range, which prevents writes from completing atomically. When a
write completes atomically, either all or none of its LBAs are committed to media. This
means writes to a common LBA range cannot be done in parallel if they are to be atomic.
The nvme emulator does not currently guarantee this, and a mix of LBAs from multiple
requests may get committed. The fio test shown below confirms this.
 
Prior to taking a command off of a SQ, a check needs to be done to determine if it
conflicts atomically with a currently executing command.
 
nvme_atomic_write_check() - Checks an NVMe command to determine if it can be started,
or if it conflicts atomically with a currently executing command.
 
Returns:   NVME_ATOMIC_NO_START - The command atomically conflicts with a currently
           executing command and can not be started.
 
           NVME_ATOMIC_START_ATOMIC  - The command is an atomic write, does not
           conflict atomically with a currently executing command, and can be started.
 
           NVME_ATOMIC_START_NONATOMIC - The command is not an atomic write, but it
           can be started.

If a command is blocked from being started, nvme_process_sq() needs to be rescheduled.
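
The check described above could be sketched roughly as follows. This is a simplified
illustration, not the code in the patch: SketchReq, lba_ranges_overlap(),
sketch_atomic_write_check() and the atomic_max_lbas parameter are hypothetical names,
the numeric values of the return codes are assumed, and the conflict policy (an overlap
blocks the new command if either side is an atomic write) is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Return codes named in the cover letter; the numeric values are assumed. */
enum {
    NVME_ATOMIC_NO_START = 0,
    NVME_ATOMIC_START_ATOMIC,
    NVME_ATOMIC_START_NONATOMIC,
};

/* Hypothetical, simplified view of a command or an executing request. */
typedef struct {
    uint64_t slba;     /* first LBA */
    uint32_t nlb;      /* number of LBAs (1-based here for readability) */
    bool     is_write;
    bool     atomic;   /* set on requests that were started as atomic writes */
} SketchReq;

/* True if [a, a + an) and [b, b + bn) share at least one LBA.
 * Overflow near UINT64_MAX is ignored in this sketch. */
static bool lba_ranges_overlap(uint64_t a, uint32_t an, uint64_t b, uint32_t bn)
{
    return a < b + bn && b < a + an;
}

/* Decide whether cmd may start, given the requests that are already executing. */
static int sketch_atomic_write_check(const SketchReq *inflight, size_t n,
                                     const SketchReq *cmd,
                                     uint32_t atomic_max_lbas)
{
    assert(cmd->nlb > 0);

    /* Assumed rule: a write is atomic if it fits in the maximum atomic size. */
    bool cmd_atomic = cmd->is_write && cmd->nlb <= atomic_max_lbas;

    for (size_t i = 0; i < n; i++) {
        const SketchReq *req = &inflight[i];

        if (!lba_ranges_overlap(cmd->slba, cmd->nlb, req->slba, req->nlb)) {
            continue;
        }
        /* Assumed policy: an overlap conflicts if either side is atomic. */
        if (cmd_atomic || req->atomic) {
            return NVME_ATOMIC_NO_START;
        }
    }
    return cmd_atomic ? NVME_ATOMIC_START_ATOMIC : NVME_ATOMIC_START_NONATOMIC;
}
```

With one atomic write executing on LBAs 100-107, a new 8-LBA write starting at LBA 104
would be held back in this sketch, while the same write starting at LBA 108 would be
allowed to start.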
 
Implementation:
---------------
Each SQ maintains a list of executing requests (sq->out_req_list). When a command is
taken off the SQ to start executing, its request is placed on out_req_list; the request
is removed when the command completes and is placed on the Completion Queue (CQ). When
nvme_process_sq() is looking to take a command off the SQ, it calls
nvme_atomic_write_check() to determine whether it is atomically safe to start executing
the command. If it is safe, nvme_atomic_write_check() returns NVME_ATOMIC_START_ATOMIC or
NVME_ATOMIC_START_NONATOMIC, and nvme_process_sq() pulls the command off the SQ and
places an associated request onto out_req_list. If it is not atomically safe
(nvme_atomic_write_check() returns NVME_ATOMIC_NO_START), the command remains on the SQ,
processing of that SQ stops, and nvme_process_sq() is rescheduled.

When nvme_atomic_write_check() is called, the out_req_list for each SQ is walked and the
LBA range of the command to be started is compared with each executing request.
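
The control flow above might look something like the following sketch. It models only
the "stop at the first blocked command and reschedule" behaviour. SketchSQ,
sketch_process_sq(), sketch_check_fn and sketch_block_on() are hypothetical names, and
the rescheduled flag is a stand-in for however nvme_process_sq() is actually
rescheduled in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum { SKETCH_NO_START, SKETCH_START_ATOMIC, SKETCH_START_NONATOMIC };

/* Stand-in for nvme_atomic_write_check(): decides whether a command may start. */
typedef int (*sketch_check_fn)(int cmd_id, void *opaque);

/* Hypothetical, simplified submission queue holding command ids. */
typedef struct {
    const int *cmds;        /* pending commands, in SQ order */
    size_t     head;        /* next command to consider */
    size_t     tail;        /* one past the last pending command */
    bool       rescheduled; /* set when processing stopped on a conflict */
} SketchSQ;

/* Start commands in order until the SQ is empty or one is blocked.
 * Returns the number of commands started by this call. */
static size_t sketch_process_sq(SketchSQ *sq, sketch_check_fn check, void *opaque)
{
    size_t started = 0;

    assert(sq->head <= sq->tail);
    sq->rescheduled = false;

    while (sq->head < sq->tail) {
        if (check(sq->cmds[sq->head], opaque) == SKETCH_NO_START) {
            /* The command stays on the SQ; processing resumes on a later call. */
            sq->rescheduled = true;
            break;
        }
        /* The command leaves the SQ; its request would join out_req_list here. */
        sq->head++;
        started++;
    }
    return started;
}

/* Example predicate: blocks the command whose id equals *(int *)opaque. */
static int sketch_block_on(int cmd_id, void *opaque)
{
    return cmd_id == *(int *)opaque ? SKETCH_NO_START : SKETCH_START_NONATOMIC;
}
```

Note that commands queued behind a blocked command are not started either, which
follows from processing of that SQ stopping at the first conflict.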

What is the Maximum Atomic Write Size?
--------------------------------------
By default, the qemu parameter atomic.awun specifies the maximum atomic write size. If
Disable Normal is set to true, either with the qemu parameter atomic.dn or with the
Set Features command, the atomic.awupf value specifies the maximum atomic write size
instead.
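
As a worked illustration of the rule above, the following sketch picks between the two
values and converts the result to bytes. It assumes the 0's based encoding that the NVMe
specification uses for AWUN and AWUPF (a field value of N means N + 1 logical blocks);
sketch_atomic_max_lbas() and sketch_atomic_max_bytes() are hypothetical helpers, not
functions from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Maximum atomic write size in logical blocks.
 * dn is the Disable Normal setting; awun and awupf are the 0's based field values. */
static uint32_t sketch_atomic_max_lbas(bool dn, uint16_t awun, uint16_t awupf)
{
    return (uint32_t)(dn ? awupf : awun) + 1;
}

/* The same limit in bytes for a given logical block size. */
static uint64_t sketch_atomic_max_bytes(bool dn, uint16_t awun, uint16_t awupf,
                                        uint32_t lba_size)
{
    assert(lba_size > 0);
    return (uint64_t)sketch_atomic_max_lbas(dn, awun, awupf) * lba_size;
}
```

For example, assuming 512-byte logical blocks, atomic.awun=63 with atomic.dn=off gives
64 blocks, or 32 KiB, while atomic.dn=on with atomic.awupf=31 gives 32 blocks, or
16 KiB.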

Testing
-------
NVMe QEMU Parameters used: atomic.dn=off,atomic.awun=63,atomic.awupf=63
 
# nvme id-ctrl /dev/nvme0 | grep awun
awun      : 15
# nvme id-ctrl /dev/nvme0 | grep awupf
awupf     : 7
# nvme id-ctrl /dev/nvme0 | grep acwu
acwu      : 0    < Since qemu-nvme doesn't support Compare and Write, this is always zero
# nvme get-feature /dev/nvme0  -f 0xa
get-feature:0x0a (Write Atomicity Normal), Current value:00000000
#
 
# fio --filename=/dev/nvme0n1 --direct=1 --rw=randwrite --bs=32k --iodepth=256 --name=iops --numjobs=50 --verify=crc64 --verify_fatal=1 --ioengine=libaio
 
When executed without atomic write support, eventually the following error will be
observed:
        crc64: verify failed at file /dev/nvme0n1 offset 857669632, length 32768
        (requested block: offset=857669632, length=32768, flags=88)
            Expected CRC: 9c87d3539dafdca0
            Received CRC: d521f7ea3b69d2ee
 
When executed with atomic write support, this error no longer happens.
 
Future Work
-----------
- Namespace support (NAWUN, NAWUPF and NACWU)
- Namespace Boundary support (NABSN, NABO, and NABSPF)
- Atomic Compare and Write Unit (ACWU)


Alan Adamson (1):
  hw/nvme: add atomic write support

 hw/nvme/ctrl.c | 161 +++++++++++++++++++++++++++++++++++++++++++++++++
 hw/nvme/nvme.h |  12 ++++
 2 files changed, 173 insertions(+)

Comments

Klaus Jensen Sept. 17, 2024, 7:59 a.m. UTC | #1
On Aug 20 09:11, Alan Adamson wrote:
> Since there is work in the Linux NVMe Driver community to add Atomic Write
> support, it would be desirable to be able to test it with qemu nvme emulation.
>  
> This patch will focus on supporting NVMe controller atomic write parameters (AWUN and
> AWUPF) but can be extended to support Namespace parameters (NAWUN and NAWUPF)
> and Boundaries (NABSN, NABO, and NABSPF).
>  
 
Hi Alan,

I am trying to test this with John's atomic-writes-v6.10-v9 linux
branch, but that does not seem to work for me.

Do I need anything else?
Alan Adamson Sept. 17, 2024, 4:21 p.m. UTC | #2
On 9/17/24 12:59 AM, Klaus Jensen wrote:
> On Aug 20 09:11, Alan Adamson wrote:
>> Since there is work in the Linux NVMe Driver community to add Atomic Write
>> support, it would be desirable to be able to test it with qemu nvme emulation.
>>   
>> This patch will focus on supporting NVMe controller atomic write parameters (AWUN and
>> AWUPF) but can be extended to support Namespace parameters (NAWUN and NAWUPF)
>> and Boundaries (NABSN, NABO, and NABSPF).
>>   
>   
> Hi Alan,
>
> I am trying to test this with John's atomic-writes-v6.10-v9 linux
> branch, but that does not seem to work for me.
>
> Do I need anything else?

Hi Klaus,

What are you trying to test?

You can see if it is being set up properly with:

[root@localhost ~]# nvme id-ctrl /dev/nvme0 | grep awupf
awupf     : 31
[root@localhost ~]#  nvme id-ctrl /dev/nvme0 | grep awun
awun      : 63
[root@localhost ~]#

With or without John's atomic support, for this case, 32k writes will be 
atomic while 64k writes will not be. This can be validated with fio 
since corruption is observed when using 64k writes.

Alan
Alan Adamson Sept. 17, 2024, 4:38 p.m. UTC | #3
On 9/17/24 9:21 AM, alan.adamson@oracle.com wrote:
>
> On 9/17/24 12:59 AM, Klaus Jensen wrote:
>> On Aug 20 09:11, Alan Adamson wrote:
>>> Since there is work in the Linux NVMe Driver community to add Atomic 
>>> Write
>>> support, it would be desirable to be able to test it with qemu nvme 
>>> emulation.
>>>   This patch will focus on supporting NVMe controller atomic write 
>>> parameters (AWUN and
>>> AWUPF) but can be extended to support Namespace parameters (NAWUN 
>>> and NAWUPF)
>>> and Boundaries (NABSN, NABO, and NABSPF).
>>   Hi Alan,
>>
>> I am trying to test this with John's atomic-writes-v6.10-v9 linux
>> branch, but that does not seem to work for me.
>>
>> Do I need anything else?
>
> Hi Klaus,
>
> What  are you trying to test?
>
> You can see if it is being setup properly with:
>
> [root@localhost ~]# nvme id-ctrl /dev/nvme0 | grep awupf
> awupf     : 31
> [root@localhost ~]#  nvme id-ctrl /dev/nvme0 | grep awun
> awun      : 63
> [root@localhost ~]#
>
> With or without John's atomic support, for this case, 32k writes will 
> be atomic while 64k writes will not be. This can be validated with fio 
> since corruption is observed when using 64k writes.
>
> Alan

BTW, I'm going to send out a v2 of the patch that includes your suggestions.

Alan