Message ID | 20241017160937.2283225-3-kbusch@meta.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | write hints for nvme fdp | expand |
On Thu, Oct 17, 2024 at 09:09:33AM -0700, Keith Busch wrote: > From: Keith Busch <kbusch@kernel.org> > > This is still backwards compatible with lifetime hints. It just doesn't > constrain the hints to that definition. So in the end we'll end up with two uses of it - the existing 5 temperature hints and the new stream separation. I think it would be cleaner to make it a union, but I don't care that strongly. But we probably want a way to distinguish which one is supported. E.g. for SCSI we set a net BLK_FEAT_WRITE_HINTS, for NVMe we'll set BLK_FEAT_STREAM_SEPARATION. Either way this should probably be the first patch in the series.
On 10/17/24 18:09, Keith Busch wrote: > From: Keith Busch <kbusch@kernel.org> > > This is still backwards compatible with lifetime hints. It just doesn't > constrain the hints to that definition. > > Signed-off-by: Keith Busch <kbusch@kernel.org> > --- > include/linux/blk-mq.h | 3 +-- > include/linux/blk_types.h | 2 +- > 2 files changed, 2 insertions(+), 3 deletions(-) > Reviewed-by: Hannes Reinecke <hare@suse.de> Cheers, Hannes
On 10/17/24 9:09 AM, Keith Busch wrote: > @@ -156,7 +155,7 @@ struct request { > struct blk_crypto_keyslot *crypt_keyslot; > #endif > > - enum rw_hint write_hint; > + unsigned short write_hint; Why 'u16' for ki_write_hint and 'unsigned short' for write_hint? Isn't that inconsistent? Thanks, Bart.
On Fri, Oct 18, 2024 at 09:12:20AM -0700, Bart Van Assche wrote: > On 10/17/24 9:09 AM, Keith Busch wrote: > > @@ -156,7 +155,7 @@ struct request { > > struct blk_crypto_keyslot *crypt_keyslot; > > #endif > > - enum rw_hint write_hint; > > + unsigned short write_hint; > > Why 'u16' for ki_write_hint and 'unsigned short' for write_hint? Isn't > that inconsistent? It's consistent with the local convetion of the existing structs. Some use unsiged short, others use u16. It looks weird to mix the types within the same struct.
On Fri, Oct 18, 2024 at 07:46:44AM +0200, Christoph Hellwig wrote: > On Thu, Oct 17, 2024 at 09:09:33AM -0700, Keith Busch wrote: > > From: Keith Busch <kbusch@kernel.org> > > > > This is still backwards compatible with lifetime hints. It just doesn't > > constrain the hints to that definition. > > So in the end we'll end up with two uses of it - the existing 5 > temperature hints and the new stream separation. I think it > would be cleaner to make it a union, but I don't care that > strongly. > > But we probably want a way to distinguish which one is supported. > > E.g. for SCSI we set a net BLK_FEAT_WRITE_HINTS, for NVMe we'll set > BLK_FEAT_STREAM_SEPARATION. > > Either way this should probably be the first patch in the series. I'm not sure I follow this feedback. The SCSI feature is defined as a lifetime stream association in SBC-5. So it's still a stream for SCSI, but you want to call it "WRITE_HINT", which is not a term used in the SCSI spec for this feature. But, you want to call it STREAM_SEPARATION for NVMe only, even though the FDP spec doesn't use that term? What's wrong with just calling it a generic hint support feature? I also don't see why SCSI couldn't use per-io hints just like this enables for NVMe. The spec doesn't limit SCSI to just 5 streams, so this provides a way to access them all through the raw block device.
On Thu, Oct 24, 2024 at 01:45:02PM -0600, Keith Busch wrote: > I'm not sure I follow this feedback. The SCSI feature is defined as a > lifetime stream association in SBC-5. So it's still a stream for SCSI, > but you want to call it "WRITE_HINT", which is not a term used in the > SCSI spec for this feature. But, you want to call it STREAM_SEPARATION > for NVMe only, even though the FDP spec doesn't use that term? What's > wrong with just calling it a generic hint support feature? The "Constrained Streams with Data Lifetimes" are called streams for political reasons but they are not. They are buckets of different data lifetimes. > I also don't see why SCSI couldn't use per-io hints just like this > enables for NVMe. The spec doesn't limit SCSI to just 5 streams, so this > provides a way to access them all through the raw block device. I don't mind passing per-I/O temperature hints to SCSI block devices. But we should not confuse a streams/FDP like streams that are different context which are assumed to be discarded together and have a concept of Stream Granularity Size or Reclaim Unit size with the data temperature hints that are at the storage level fundamentally per-I/O and just bucket into temperature group without any indication of data locality.
diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h index 59e9adf815a49..bf007a4081d9b 100644 --- a/include/linux/blk-mq.h +++ b/include/linux/blk-mq.h @@ -8,7 +8,6 @@ #include <linux/scatterlist.h> #include <linux/prefetch.h> #include <linux/srcu.h> -#include <linux/rw_hint.h> struct blk_mq_tags; struct blk_flush_queue; @@ -156,7 +155,7 @@ struct request { struct blk_crypto_keyslot *crypt_keyslot; #endif - enum rw_hint write_hint; + unsigned short write_hint; unsigned short ioprio; enum mq_rq_state state; diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h index dce7615c35e7e..56b7fb961e0c7 100644 --- a/include/linux/blk_types.h +++ b/include/linux/blk_types.h @@ -219,7 +219,7 @@ struct bio { */ unsigned short bi_flags; /* BIO_* below */ unsigned short bi_ioprio; - enum rw_hint bi_write_hint; + unsigned short bi_write_hint; blk_status_t bi_status; atomic_t __bi_remaining;