
[0/4] hw/nvme: add support for TP4084

Message ID 20220608012850.668695-1-niklas.cassel@wdc.com

Message

Niklas Cassel June 8, 2022, 1:28 a.m. UTC
Hello there,

considering that Linux v5.19-rc1 is out, which includes support for
NVMe TP4084:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/drivers/nvme/host/core.c?id=354201c53e61e493017b15327294b0c8ab522d69

I thought that it might be nice to have QEMU support for the same.

TP4084 adds a new mode, CC.CRIME, that can be used to mark a namespace
as ready independently of the controller.

When CC.CRIME is 0 (the default), things behave as before: all
namespaces are ready when CSTS.RDY is set to 1.

Add a new "ready_delay" namespace device parameter, in order to emulate
different ready latencies for namespaces when CC.CRIME is 1.

The patch series also adds a "crwmt" controller parameter, in order to
expose the worst-case time that the host should wait for all namespaces
to become ready.


Example QEMU command line for the new options:

# delay in s (20s)
NS1_DELAY_S=20
# convert to units of 500ms
NS1_DELAY=$((NS1_DELAY_S*2))

# delay in s (60s)
NS2_DELAY_S=60
# convert to units of 500ms
NS2_DELAY=$((NS2_DELAY_S*2))

# timeout in s (120s)
CRWMT_S=120
# convert to units of 500ms
CRWMT=$((CRWMT_S*2))

             -device nvme,serial=deadbeef,crwmt=$CRWMT \
             -drive file=$NS1_DATA,id=nvm-1,format=raw,if=none \
             -device nvme-ns,drive=nvm-1,ready_delay=$NS1_DELAY \
             -drive file=$NS2_DATA,id=nvm-2,format=raw,if=none \
             -device nvme-ns,drive=nvm-2,ready_delay=$NS2_DELAY \
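To illustrate what these parameters model (this is a hedged sketch, not code
from the series or from any driver), the host-side wait that CRWMT bounds when
CC.CRIME is 1 can be written as a polling loop; all values are in the spec's
500 ms units, matching the NS*_DELAY and CRWMT variables above:

```python
# Hypothetical sketch (not from the series): the host-side wait that
# CRWMT bounds when CC.CRIME is 1. All values are in 500 ms units,
# matching the NS*_DELAY and CRWMT variables above.

def wait_for_namespaces(ready_delays, crwmt):
    """Poll until every namespace is ready or CRWMT expires.

    ready_delays: per-namespace ready latency (500 ms units)
    crwmt:        worst-case time to wait (500 ms units)
    Returns (all_ready, elapsed_units).
    """
    elapsed = 0
    pending = list(ready_delays)
    while pending and elapsed < crwmt:
        elapsed += 1  # one 500 ms polling tick
        # A namespace reports ready once its delay has elapsed.
        pending = [d for d in pending if d > elapsed]
    return (not pending, elapsed)

# The cover letter's example: 20 s and 60 s delays, 120 s CRWMT.
# Both namespaces become ready well before the worst-case timeout.
all_ready, elapsed = wait_for_namespaces([40, 120], 240)
print(all_ready, elapsed)
```

A real driver would read per-namespace ready status from the device rather
than know the delays up front, but the shape of the timeout logic is the same.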


Niklas Cassel (4):
  hw/nvme: claim NVMe 2.0 compliance
  hw/nvme: store a pointer to the NvmeSubsystem in the NvmeNamespace
  hw/nvme: add support for ratified TP4084
  hw/nvme: add new never_ready parameter to test the DNR bit

 hw/nvme/ctrl.c       | 151 +++++++++++++++++++++++++++++++++++++++++--
 hw/nvme/ns.c         |  17 +++++
 hw/nvme/nvme.h       |   9 +++
 hw/nvme/trace-events |   1 +
 include/block/nvme.h |  60 ++++++++++++++++-
 5 files changed, 233 insertions(+), 5 deletions(-)

Comments

Klaus Jensen June 16, 2022, 10:42 a.m. UTC | #1
On Jun  8 03:28, Niklas Cassel via wrote:
> [...]

Hi Niklas,

I've been going back and forth on my position on this.

I'm not straight up against it, but this only seems useful as a one-off
patch to test the kernel support for this. Considering the limitations
you state and the limited use case, I fear this is a little bloaty to
carry upstream.

But I totally acknowledge that this is a horribly complicated behavior
to implement on the driver side, so I guess we might all benefit from
this.

Keith, do you have an opinion on this?
Keith Busch June 16, 2022, 1:57 p.m. UTC | #2
On Thu, Jun 16, 2022 at 12:42:49PM +0200, Klaus Jensen wrote:
> On Jun  8 03:28, Niklas Cassel via wrote:
> > [...]
> 
> Hi Niklas,
> 
> I've been going back and forth on my position on this.
> 
> I'm not straight up against it, but this only seems useful as a one-off
> patch to test the kernel support for this. Considering the limitations
> you state and the limited use case, I fear this is a little bloaty to
> carry upstream.
> 
> But I totally acknowledge that this is a horribly complicated behavior
> to implement on the driver side, so I guess we might all benefit from
> this.
> 
> Keith, do you have an opinion on this?

There are drivers utilizing the capability, so I think supporting it is fine
despite this environment not having any inherent time-to-ready delays.

This will probably be another knob that gets lots of use for initial driver
validation, then largely forgotten. But maybe it will be useful for driver unit
and regression testing in the future.
Klaus Jensen June 16, 2022, 4:08 p.m. UTC | #3
On Jun 16 07:57, Keith Busch wrote:
> On Thu, Jun 16, 2022 at 12:42:49PM +0200, Klaus Jensen wrote:
> > [...]
> 
> There are drivers utilizing the capability, so I think supporting it is fine
> despite this environment not having any inherent time-to-ready delays.
> 
> This will probably be another knob that gets lots of use for initial driver
> validation, then largely forgotten. But maybe it will be useful for driver unit
> and regression testing in the future.

Alright, sounds good. I'll give it a proper review :)