Message ID | 20230706130930.64283-2-nmi@metaspace.dk (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | ublk: enable zoned storage support | expand |
On 7/6/23 22:09, Andreas Hindborg wrote: > From: Andreas Hindborg <a.hindborg@samsung.com> > > Ublk zoned storage support relies on DRV_IN handling for zone report. > Prepare for this change by adding offsets for the DRV_IN/DRV_OUT commands. > > Also add parenthesis to existing opcodes for better macro hygiene. > > Signed-off-by: Andreas Hindborg <a.hindborg@samsung.com> > --- > include/uapi/linux/ublk_cmd.h | 18 ++++++++++++++---- > 1 file changed, 14 insertions(+), 4 deletions(-) > > diff --git a/include/uapi/linux/ublk_cmd.h b/include/uapi/linux/ublk_cmd.h > index 4b8558db90e1..2ebb8d5d827a 100644 > --- a/include/uapi/linux/ublk_cmd.h > +++ b/include/uapi/linux/ublk_cmd.h > @@ -229,12 +229,22 @@ struct ublksrv_ctrl_dev_info { > __u64 reserved2; > }; > > -#define UBLK_IO_OP_READ 0 > +#define UBLK_IO_OP_READ 0 > #define UBLK_IO_OP_WRITE 1 > #define UBLK_IO_OP_FLUSH 2 > -#define UBLK_IO_OP_DISCARD 3 > -#define UBLK_IO_OP_WRITE_SAME 4 > -#define UBLK_IO_OP_WRITE_ZEROES 5 > +#define UBLK_IO_OP_DISCARD 3 > +#define UBLK_IO_OP_WRITE_SAME 4 > +#define UBLK_IO_OP_WRITE_ZEROES 5 > +/* > + * Passthrough (driver private) operation codes range between This is unclear... Here, what does "driver" refer to ? If it is the ublk kernel driver, than these commands should not be defined in the uapi header file, they should be defined in drivers/block/ublk.h. However, if these are for the user space driver, like all the other operations, then let's clearly state so. But then, I still not understand why these need a different naming pattern using the "__UBLK" prefix... > + * __UBLK_IO_OP_DRV_IN_START and __UBLK_IO_OP_DRV_IN_END for REQ_OP_DRV_IN type > + * operations and between __UBLK_IO_OP_DRV_OUT_START and > + * __UBLK_IO_OP_DRV_OUT_END for REQ_OP_DRV_OUT type operations. > + */ > +#define __UBLK_IO_OP_DRV_IN_START 32 > +#define __UBLK_IO_OP_DRV_IN_END 96 > +#define __UBLK_IO_OP_DRV_OUT_START __UBLK_IO_OP_DRV_IN_END > +#define __UBLK_IO_OP_DRV_OUT_END 160 > > #define UBLK_IO_F_FAILFAST_DEV (1U << 8) > #define UBLK_IO_F_FAILFAST_TRANSPORT (1U << 9)
On Fri, Jul 07, 2023 at 08:50:01AM +0900, Damien Le Moal wrote: > On 7/6/23 22:09, Andreas Hindborg wrote: > > From: Andreas Hindborg <a.hindborg@samsung.com> > > > > Ublk zoned storage support relies on DRV_IN handling for zone report. > > Prepare for this change by adding offsets for the DRV_IN/DRV_OUT commands. > > > > Also add parenthesis to existing opcodes for better macro hygiene. > > > > Signed-off-by: Andreas Hindborg <a.hindborg@samsung.com> > > --- > > include/uapi/linux/ublk_cmd.h | 18 ++++++++++++++---- > > 1 file changed, 14 insertions(+), 4 deletions(-) > > > > diff --git a/include/uapi/linux/ublk_cmd.h b/include/uapi/linux/ublk_cmd.h > > index 4b8558db90e1..2ebb8d5d827a 100644 > > --- a/include/uapi/linux/ublk_cmd.h > > +++ b/include/uapi/linux/ublk_cmd.h > > @@ -229,12 +229,22 @@ struct ublksrv_ctrl_dev_info { > > __u64 reserved2; > > }; > > > > -#define UBLK_IO_OP_READ 0 > > +#define UBLK_IO_OP_READ 0 > > #define UBLK_IO_OP_WRITE 1 > > #define UBLK_IO_OP_FLUSH 2 > > -#define UBLK_IO_OP_DISCARD 3 > > -#define UBLK_IO_OP_WRITE_SAME 4 > > -#define UBLK_IO_OP_WRITE_ZEROES 5 > > +#define UBLK_IO_OP_DISCARD 3 > > +#define UBLK_IO_OP_WRITE_SAME 4 > > +#define UBLK_IO_OP_WRITE_ZEROES 5 > > +/* > > + * Passthrough (driver private) operation codes range between > > This is unclear... Here, what does "driver" refer to ? If it is the ublk > kernel driver, than these commands should not be defined in the uapi > header file, they should be defined in drivers/block/ublk.h. However, if > these are for the user space driver, like all the other operations, then Like normal IO, these passthrough requests needs userspace to handle too, usually they just belong to specific ublk target, such as report zones., so here it is part of UAPI. But yes, we should document it clearly, maybe something below? Ublk passthrough operation code ranges, and each passthrough operation provides generic interface between ublk kernel driver and ublk userspace, and this interface is usually used for handling generic block layer request, such as command of zoned report zones. Passthrough operation is only needed iff ublk kernel driver has to be involved for handling this operation. > let's clearly state so. But then, I still not understand why these need > a different naming pattern using the "__UBLK" prefix... I think __UBLK just meant we don't suggest userspace to use it directly, since the added macros are just for making ranges for DRV_IN and DRV_OUT, so we can check command direction easily be using this start/end info in both sides. Thanks, Ming
On 7/7/23 09:59, Ming Lei wrote: > On Fri, Jul 07, 2023 at 08:50:01AM +0900, Damien Le Moal wrote: >> On 7/6/23 22:09, Andreas Hindborg wrote: >>> From: Andreas Hindborg <a.hindborg@samsung.com> >>> >>> Ublk zoned storage support relies on DRV_IN handling for zone report. >>> Prepare for this change by adding offsets for the DRV_IN/DRV_OUT commands. >>> >>> Also add parenthesis to existing opcodes for better macro hygiene. >>> >>> Signed-off-by: Andreas Hindborg <a.hindborg@samsung.com> >>> --- >>> include/uapi/linux/ublk_cmd.h | 18 ++++++++++++++---- >>> 1 file changed, 14 insertions(+), 4 deletions(-) >>> >>> diff --git a/include/uapi/linux/ublk_cmd.h b/include/uapi/linux/ublk_cmd.h >>> index 4b8558db90e1..2ebb8d5d827a 100644 >>> --- a/include/uapi/linux/ublk_cmd.h >>> +++ b/include/uapi/linux/ublk_cmd.h >>> @@ -229,12 +229,22 @@ struct ublksrv_ctrl_dev_info { >>> __u64 reserved2; >>> }; >>> >>> -#define UBLK_IO_OP_READ 0 >>> +#define UBLK_IO_OP_READ 0 >>> #define UBLK_IO_OP_WRITE 1 >>> #define UBLK_IO_OP_FLUSH 2 >>> -#define UBLK_IO_OP_DISCARD 3 >>> -#define UBLK_IO_OP_WRITE_SAME 4 >>> -#define UBLK_IO_OP_WRITE_ZEROES 5 >>> +#define UBLK_IO_OP_DISCARD 3 >>> +#define UBLK_IO_OP_WRITE_SAME 4 >>> +#define UBLK_IO_OP_WRITE_ZEROES 5 >>> +/* >>> + * Passthrough (driver private) operation codes range between >> >> This is unclear... Here, what does "driver" refer to ? If it is the ublk >> kernel driver, than these commands should not be defined in the uapi >> header file, they should be defined in drivers/block/ublk.h. However, if >> these are for the user space driver, like all the other operations, then > > Like normal IO, these passthrough requests needs userspace to handle too, > usually they just belong to specific ublk target, such as report zones., > so here it is part of UAPI. > > But yes, we should document it clearly, maybe something below? > > Ublk passthrough operation code ranges, and each passthrough > operation provides generic interface between ublk kernel driver > and ublk userspace, and this interface is usually used for handling > generic block layer request, such as command of zoned report zones. > Passthrough operation is only needed iff ublk kernel driver has to > be involved for handling this operation. Yes, that is better. > >> let's clearly state so. But then, I still not understand why these need >> a different naming pattern using the "__UBLK" prefix... > > I think __UBLK just meant we don't suggest userspace to use it directly, > since the added macros are just for making ranges for DRV_IN and DRV_OUT, > so we can check command direction easily be using this start/end info in > both sides. Personally, I would still prefer to not add this "__" prefix as these are operations that the ublk user driver will have to deal with, like the other ones. So I do not see the point of that prefix. But no strong feeling about that :) > > > Thanks, > Ming >
On Fri, Jul 07, 2023 at 08:59:03AM +0800, Ming Lei wrote: > > let's clearly state so. But then, I still not understand why these need > > a different naming pattern using the "__UBLK" prefix... > > I think __UBLK just meant we don't suggest userspace to use it directly, > since the added macros are just for making ranges for DRV_IN and DRV_OUT, > so we can check command direction easily be using this start/end info in > both sides. Folks, please stop coupling a uapi (or on-disk protocol) too tightly to Linux internals. Think of what makes sense as a communication protocol, not what is an internal kernel interface. REPORT_ZONES is a sensible command, and supported in ATA/SCSI/NVMe in one way or another. In Linux it is a synchronous method call right now for one reason or another, and most implementation map it to a passthrough command - be that the actual protocol command or something internal for virtio. So for ublk this is just another command like any other, that needs to be defined and documented. Nothing internal or driver specific.
On Sun, Jul 09, 2023 at 11:52:39PM -0700, Christoph Hellwig wrote: > On Fri, Jul 07, 2023 at 08:59:03AM +0800, Ming Lei wrote: > > > let's clearly state so. But then, I still not understand why these need > > > a different naming pattern using the "__UBLK" prefix... > > > > I think __UBLK just meant we don't suggest userspace to use it directly, > > since the added macros are just for making ranges for DRV_IN and DRV_OUT, > > so we can check command direction easily be using this start/end info in > > both sides. > > Folks, please stop coupling a uapi (or on-disk protocol) too tightly > to Linux internals. Think of what makes sense as a communication > protocol, not what is an internal kernel interface. > > REPORT_ZONES is a sensible command, and supported in ATA/SCSI/NVMe in > one way or another. In Linux it is a synchronous method call right now > for one reason or another, and most implementation map it to a > passthrough command - be that the actual protocol command or something > internal for virtio. > > So for ublk this is just another command like any other, that needs to > be defined and documented. Nothing internal or driver specific. Yes, that is exactly what we are doing. The added macros of UBLK_IO_OP_DRV_IN_START[END] are just for supporting more ublk passthrough commands, and the motivation is for running check(such as buffer direction) in two sides easily. However, I think it is just fine to delay to add it until introducing the 2nd ublk pt command. Thanks, Ming
On Mon, Jul 10, 2023 at 05:27:23PM +0800, Ming Lei wrote: > Yes, that is exactly what we are doing. > > The added macros of UBLK_IO_OP_DRV_IN_START[END] are just for supporting > more ublk passthrough commands, and the motivation is for running > check(such as buffer direction) in two sides easily. > > However, I think it is just fine to delay to add it until introducing > the 2nd ublk pt command. The concept of a passthrough command just doesn't make sense for an on the wire protocol. It is a linux concept that distinguished between the Linux synthetic command like REQ_OP_READ/WRITE/DISCARD etc that are well defined and can be used by file systems and other consumers, and ways to pass through arbitrary blobs only known by the driver. Anything in a wire protocol needs to be very well defined in that protocol completely indpendent of what Linux concept it maps to. Especially as the Linux concepts can change, and fairly frequently do.
On Mon, Jul 10, 2023 at 02:32:44AM -0700, Christoph Hellwig wrote: > On Mon, Jul 10, 2023 at 05:27:23PM +0800, Ming Lei wrote: > > Yes, that is exactly what we are doing. > > > > The added macros of UBLK_IO_OP_DRV_IN_START[END] are just for supporting > > more ublk passthrough commands, and the motivation is for running > > check(such as buffer direction) in two sides easily. > > > > However, I think it is just fine to delay to add it until introducing > > the 2nd ublk pt command. > > The concept of a passthrough command just doesn't make sense for an > on the wire protocol. It is a linux concept that distinguished between > the Linux synthetic command like REQ_OP_READ/WRITE/DISCARD etc that are > well defined and can be used by file systems and other consumers, and > ways to pass through arbitrary blobs only known by the driver. > > Anything in a wire protocol needs to be very well defined in that > protocol completely indpendent of what Linux concept it maps to. > Especially as the Linux concepts can change, and fairly frequently do. Yeah, you are right wrt. linux pt command, and here we shouldn't use the term of passthrough. Actually the UAPI is trying to define interface between driver and userspace, which is just like interface between driver and hardware, such as, how nvme/sd_zbc is dealing with actual hardware wrt. reporting zones. Thanks, Ming
Christoph Hellwig <hch@infradead.org> writes: > On Mon, Jul 10, 2023 at 05:27:23PM +0800, Ming Lei wrote: >> Yes, that is exactly what we are doing. >> >> The added macros of UBLK_IO_OP_DRV_IN_START[END] are just for supporting >> more ublk passthrough commands, and the motivation is for running >> check(such as buffer direction) in two sides easily. >> >> However, I think it is just fine to delay to add it until introducing >> the 2nd ublk pt command. > > The concept of a passthrough command just doesn't make sense for an > on the wire protocol. It is a linux concept that distinguished between > the Linux synthetic command like REQ_OP_READ/WRITE/DISCARD etc that are > well defined and can be used by file systems and other consumers, and > ways to pass through arbitrary blobs only known by the driver. Yet most on-the-wire protocols for actual hardware does support this some way or another. But I agree that for ublk it is probably not needed. It would probably be easier to talk to the ublk daemon through other means than passthrough in the block layer. > > Anything in a wire protocol needs to be very well defined in that > protocol completely indpendent of what Linux concept it maps to. > Especially as the Linux concepts can change, and fairly frequently do. I somewhat agree in the sense that for consistency, we should either move zone management commands to the DRV_OUT range OR move report_zones out of this special range and just next to the zone management operations. I like the latter option better, and I would love to see the block layer do the same at some point. It feels backwards that report_zones get special treatment all over the place. Best regards, Andreas
On Tue, Jul 11, 2023 at 08:23:40AM +0200, Andreas Hindborg (Samsung) wrote: > Yet most on-the-wire protocols for actual hardware does support this > some way or another. Supports what? Passthrough? No. > I somewhat agree in the sense that for consistency, we should either > move zone management commands to the DRV_OUT range OR move report_zones > out of this special range and just next to the zone management > operations. I like the latter option better, and I would love to see the > block layer do the same at some point. It feels backwards that > report_zones get special treatment all over the place. DRV_IN/OUT is purely a Linux concept. It doesn't make any sense for a wire protocol.
Christoph Hellwig <hch@infradead.org> writes: > On Tue, Jul 11, 2023 at 08:23:40AM +0200, Andreas Hindborg (Samsung) wrote: >> Yet most on-the-wire protocols for actual hardware does support this >> some way or another. > > Supports what? Passthrough? No. Both SCSI and NVMe has command identifier ranges reserved for vendor specific commands. I would assume that one use of these is to implement passthrough channels to a device for testing out new interfaces. Just guessing though. > >> I somewhat agree in the sense that for consistency, we should either >> move zone management commands to the DRV_OUT range OR move report_zones >> out of this special range and just next to the zone management >> operations. I like the latter option better, and I would love to see the >> block layer do the same at some point. It feels backwards that >> report_zones get special treatment all over the place. > > DRV_IN/OUT is purely a Linux concept. It doesn't make any sense for > a wire protocol. Ok
On Tue, Jul 11, 2023 at 11:02:15AM +0200, Andreas Hindborg (Samsung) wrote: > > Christoph Hellwig <hch@infradead.org> writes: > > > On Tue, Jul 11, 2023 at 08:23:40AM +0200, Andreas Hindborg (Samsung) wrote: > >> Yet most on-the-wire protocols for actual hardware does support this > >> some way or another. > > > > Supports what? Passthrough? No. > > Both SCSI and NVMe has command identifier ranges reserved for vendor > specific commands. I would assume that one use of these is to implement > passthrough channels to a device for testing out new interfaces. Just > guessing though. Vendor specific commands is an entirely different concept from Linux passthrough requests.
Christoph Hellwig <hch@infradead.org> writes: > On Tue, Jul 11, 2023 at 11:02:15AM +0200, Andreas Hindborg (Samsung) wrote: >> >> Christoph Hellwig <hch@infradead.org> writes: >> >> > On Tue, Jul 11, 2023 at 08:23:40AM +0200, Andreas Hindborg (Samsung) wrote: >> >> Yet most on-the-wire protocols for actual hardware does support this >> >> some way or another. >> > >> > Supports what? Passthrough? No. >> >> Both SCSI and NVMe has command identifier ranges reserved for vendor >> specific commands. I would assume that one use of these is to implement >> passthrough channels to a device for testing out new interfaces. Just >> guessing though. > > Vendor specific commands is an entirely different concept from Linux > passthrough requests. And yet they are somewhat similar, in the sense that they allow the user of a protocol to express semantics that is not captured in the established protocol. Uring command passthrough -> request passthrough -> vendor specific commands. They sort of map well in terms of what they allow the user to achieve. Or did I misunderstand something completely?
On Tue, Jul 11, 2023 at 12:15:18PM +0200, Andreas Hindborg (Samsung) wrote: > And yet they are somewhat similar, in the sense that they allow the user > of a protocol to express semantics that is not captured in the > established protocol. Uring command passthrough -> request passthrough > -> vendor specific commands. They sort of map well in terms of what they > allow the user to achieve. Or did I misunderstand something completely? Well, there is a relationship, but it's one way. Vendor specific command are basically always going to be used through a passthrough interface, because they aren't standardized. But most commands used through a passthrough interface are normal standardized commands, just either used in a way not supported by the normal Linux interface or just in creative ways.
diff --git a/include/uapi/linux/ublk_cmd.h b/include/uapi/linux/ublk_cmd.h index 4b8558db90e1..2ebb8d5d827a 100644 --- a/include/uapi/linux/ublk_cmd.h +++ b/include/uapi/linux/ublk_cmd.h @@ -229,12 +229,22 @@ struct ublksrv_ctrl_dev_info { __u64 reserved2; }; -#define UBLK_IO_OP_READ 0 +#define UBLK_IO_OP_READ 0 #define UBLK_IO_OP_WRITE 1 #define UBLK_IO_OP_FLUSH 2 -#define UBLK_IO_OP_DISCARD 3 -#define UBLK_IO_OP_WRITE_SAME 4 -#define UBLK_IO_OP_WRITE_ZEROES 5 +#define UBLK_IO_OP_DISCARD 3 +#define UBLK_IO_OP_WRITE_SAME 4 +#define UBLK_IO_OP_WRITE_ZEROES 5 +/* + * Passthrough (driver private) operation codes range between + * __UBLK_IO_OP_DRV_IN_START and __UBLK_IO_OP_DRV_IN_END for REQ_OP_DRV_IN type + * operations and between __UBLK_IO_OP_DRV_OUT_START and + * __UBLK_IO_OP_DRV_OUT_END for REQ_OP_DRV_OUT type operations. + */ +#define __UBLK_IO_OP_DRV_IN_START 32 +#define __UBLK_IO_OP_DRV_IN_END 96 +#define __UBLK_IO_OP_DRV_OUT_START __UBLK_IO_OP_DRV_IN_END +#define __UBLK_IO_OP_DRV_OUT_END 160 #define UBLK_IO_F_FAILFAST_DEV (1U << 8) #define UBLK_IO_F_FAILFAST_TRANSPORT (1U << 9)