diff mbox series

[v2] public/io/blkif.h: try to fix the semantics of sector based quantities

Message ID 20190402165803.28667-1-paul.durrant@citrix.com (mailing list archive)
State Superseded
Headers show
Series [v2] public/io/blkif.h: try to fix the semantics of sector based quantities | expand

Commit Message

Paul Durrant April 2, 2019, 4:58 p.m. UTC
This is actually v2 of the patch posted in [1]. Please see the thread
starting there for more context...

The semantics of sector based quanties, such as first_sect and last_sect in
blkif_request_segment, and the value of "sectors" in the backend info
in xenstore have become confused. Some comments in the header suggest they
should be supplied/interpreted strictly in terms of 512-byte units, others
suggest they should be scaled by the value of "sector-size" i.e. the
logical block size of the underlying backend storage.
This confusion has caused mixed semantics to become ingrained in frontend
implementations. For instance Linux xen-blkfront.c contains code such as:

    fsect = offset >> 9;
    lsect = fsect + (len >> 9) - 1;

whereas the Windows XENVBD frontend contains the following equivalent code:

    Segment->FirstSector = (UCHAR)((Offset + SectorSize - 1) / SectorSize);
    *SectorsNow = __min(SectorsLeft, SectorsPerPage - Segment->FirstSector);
    Segment->LastSector = (UCHAR)(Segment->FirstSector + *SectorsNow - 1);

(where SectorSize is the "sector-size" value advertized in xenstore).

Thus it has become unsafe for a backend to set "sector-size" to anything
other than 512 as it does not know which way the frontend is coded.

This patch is intended to clarify the situation and also introduce a
mechanism to allow logical block sizes of more than 512 to be supported...

A new frontend feature node is specified: 'feature-large-sector-size'.
If this node is present and set to "1" then it means that frontend is
coded to supply and interpret all sector based quantities in terms of the
the advertized "sector-size" value rather than a hardcoded size of 512.

[1] https://lists.xenproject.org/archives/html/xen-devel/2019-03/msg01600.html

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
---
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Roger Pau Monné <roger.pau@citrix.com>
Cc: Athony PERARD <anthony.perard@citrix.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: George Dunlap <George.Dunlap@eu.citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Julien Grall <julien.grall@arm.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Tim Deegan <tim@xen.org>
Cc: Wei Liu <wei.liu2@citrix.com>

v2:
 - Drop Konrad's original ack since the patch has substantially changed
 - Cc several different OS blkfront maintainers and 'the rest' maintainers
   for more opinion
---
 xen/include/public/io/blkif.h | 40 ++++++++++++++++++++++++++---------
 1 file changed, 30 insertions(+), 10 deletions(-)

Comments

Julien Grall April 3, 2019, 9:07 a.m. UTC | #1
(+ Juergen as he maintains the PV I/O with Konrad)

On 02/04/2019 17:58, Paul Durrant wrote:
> This is actually v2 of the patch posted in [1]. Please see the thread
> starting there for more context...

Should not this belong after ---?

> 
> The semantics of sector based quanties, such as first_sect and last_sect in

s/quanties/quantities/

> blkif_request_segment, and the value of "sectors" in the backend info
> in xenstore have become confused. Some comments in the header suggest they
> should be supplied/interpreted strictly in terms of 512-byte units, others
> suggest they should be scaled by the value of "sector-size" i.e. the
> logical block size of the underlying backend storage.
> This confusion has caused mixed semantics to become ingrained in frontend
> implementations. For instance Linux xen-blkfront.c contains code such as:
> 
>      fsect = offset >> 9;
>      lsect = fsect + (len >> 9) - 1;
> 
> whereas the Windows XENVBD frontend contains the following equivalent code:
> 
>      Segment->FirstSector = (UCHAR)((Offset + SectorSize - 1) / SectorSize);
>      *SectorsNow = __min(SectorsLeft, SectorsPerPage - Segment->FirstSector);
>      Segment->LastSector = (UCHAR)(Segment->FirstSector + *SectorsNow - 1);
> 
> (where SectorSize is the "sector-size" value advertized in xenstore).
> 
> Thus it has become unsafe for a backend to set "sector-size" to anything
> other than 512 as it does not know which way the frontend is coded.
> 
> This patch is intended to clarify the situation and also introduce a
> mechanism to allow logical block sizes of more than 512 to be supported...
> 
> A new frontend feature node is specified: 'feature-large-sector-size'.
> If this node is present and set to "1" then it means that frontend is
> coded to supply and interpret all sector based quantities in terms of the
> the advertized "sector-size" value rather than a hardcoded size of 512.
> 
> [1] https://lists.xenproject.org/archives/html/xen-devel/2019-03/msg01600.html
> 
> Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
> ---
> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> Cc: Roger Pau Monné <roger.pau@citrix.com>
> Cc: Athony PERARD <anthony.perard@citrix.com>
> Cc: Andrew Cooper <andrew.cooper3@citrix.com>
> Cc: George Dunlap <George.Dunlap@eu.citrix.com>
> Cc: Ian Jackson <ian.jackson@eu.citrix.com>
> Cc: Jan Beulich <jbeulich@suse.com>
> Cc: Julien Grall <julien.grall@arm.com>
> Cc: Stefano Stabellini <sstabellini@kernel.org>
> Cc: Tim Deegan <tim@xen.org>
> Cc: Wei Liu <wei.liu2@citrix.com>
> 
> v2:
>   - Drop Konrad's original ack since the patch has substantially changed
>   - Cc several different OS blkfront maintainers and 'the rest' maintainers
>     for more opinion
> ---
>   xen/include/public/io/blkif.h | 40 ++++++++++++++++++++++++++---------
>   1 file changed, 30 insertions(+), 10 deletions(-)
> 
> diff --git a/xen/include/public/io/blkif.h b/xen/include/public/io/blkif.h
> index 15a71e3fea..d4a34de94c 100644
> --- a/xen/include/public/io/blkif.h
> +++ b/xen/include/public/io/blkif.h
> @@ -254,18 +254,26 @@
>    * sector-size
>    *      Values:         <uint32_t>
>    *
> - *      The logical sector size, in bytes, of the backend device.
> + *      The logical block size, in bytes, of the underlying storage. This
> + *      must be a power of two with a minimum value of 512.
> + *
> + *      NOTE: Because of implementation bugs in some frontends this must be
> + *            set to 512, unless the frontend advertizes a non-zero value
> + *            in its "feature-large-sector-size" xenbus node. (See below).
>    *
>    * physical-sector-size
>    *      Values:         <uint32_t>
> + *      Default Value:  <"sector-size">
>    *
> - *      The physical sector size, in bytes, of the backend device.
> + *      The physical block size, in bytes, of the backend storage. This
> + *      must be an integer multiple of "sector-size".
>    *
>    * sectors
>    *      Values:         <uint64_t>
>    *
> - *      The size of the backend device, expressed in units of its logical
> - *      sector size ("sector-size").
> + *      The size of the backend device, expressed in units of "sector-size".
> + *      The product of "sector-size" and "sectors" must also be an integer
> + *      multiple of "physical-sector-size", if that node is present.
>    *
>    *****************************************************************************
>    *                            Frontend XenBus Nodes
> @@ -321,6 +329,8 @@
>    *      The size of the frontend allocated request ring buffer in units of
>    *      machine pages.  The value must be a power of 2.
>    *
> + *--------------------------------- Features ---------------------------------
> + *
>    * feature-persistent
>    *      Values:         0/1 (boolean)
>    *      Default Value:  0
> @@ -342,6 +352,17 @@
>    *      decides to limit the maximum number of persistently mapped grants
>    *      to a value less than RING_SIZE * BLKIF_MAX_SEGMENTS_PER_REQUEST.
>    *
> + * feature-large-sector-size
> + *      Values:         0/1 (boolean)
> + *      Default Value:  0
> + *
> + *      A value of "1" indicates that the frontend will correctly supply and
> + *      interpret all sector-based quantities in terms of the "sector-size"
> + *      value supplied in the backend info, whatever that may be set to.
> + *      If this node is not present or its value is "0" then it is assumed
> + *      that the frontend requires that the logical block size is 512 as it
> + *      is hardcoded (which is the case in some frontend implementations).
> + *
>    *------------------------- Virtual Device Properties -------------------------
>    *
>    * device-type
> @@ -607,12 +628,11 @@
>   #define BLKIF_MAX_INDIRECT_PAGES_PER_REQUEST 8
>   
>   /*
> - * NB. first_sect and last_sect in blkif_request_segment, as well as
> - * sector_number in blkif_request, are always expressed in 512-byte units.
> - * However they must be properly aligned to the real sector size of the
> - * physical disk, which is reported in the "physical-sector-size" node in
> - * the backend xenbus info. Also the xenbus "sectors" node is expressed in
> - * 512-byte units.
> + * NB. 'first_sect' and 'last_sect' in blkif_request_segment, as well as
> + * 'sector_number' in blkif_request, blkif_request_discard and
> + * blkif_request_indirect are sector-based quantities. See the description
> + * of the "feature-large-sector-size" frontend xenbus node above for
> + * more information.
>    */
>   struct blkif_request_segment {
>       grant_ref_t gref;        /* reference to I/O buffer frame        */
>
Paul Durrant April 3, 2019, 9:10 a.m. UTC | #2
> -----Original Message-----
> From: Julien Grall [mailto:julien.grall@arm.com]
> Sent: 03 April 2019 10:08
> To: Paul Durrant <Paul.Durrant@citrix.com>; xen-devel@lists.xenproject.org
> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>; Roger Pau Monne <roger.pau@citrix.com>; Anthony
> Perard <anthony.perard@citrix.com>; Andrew Cooper <Andrew.Cooper3@citrix.com>; George Dunlap
> <George.Dunlap@citrix.com>; Ian Jackson <Ian.Jackson@citrix.com>; Jan Beulich <jbeulich@suse.com>;
> Stefano Stabellini <sstabellini@kernel.org>; Tim (Xen.org) <tim@xen.org>; Wei Liu
> <wei.liu2@citrix.com>; Juergen Gross <jgross@suse.com>
> Subject: Re: [PATCH v2] public/io/blkif.h: try to fix the semantics of sector based quantities
> 
> (+ Juergen as he maintains the PV I/O with Konrad)
> 
> On 02/04/2019 17:58, Paul Durrant wrote:
> > This is actually v2 of the patch posted in [1]. Please see the thread
> > starting there for more context...
> 
> Should not this belong after ---?

True, it doesn't really want to go into the commit comment.

> 
> >
> > The semantics of sector based quanties, such as first_sect and last_sect in
> 
> s/quanties/quantities/
> 

Good spot. Thanks,

  Paul
Jürgen Groß April 3, 2019, 1:27 p.m. UTC | #3
On 02/04/2019 18:58, Paul Durrant wrote:
> This is actually v2 of the patch posted in [1]. Please see the thread
> starting there for more context...
> 
> The semantics of sector based quanties, such as first_sect and last_sect in
> blkif_request_segment, and the value of "sectors" in the backend info
> in xenstore have become confused. Some comments in the header suggest they
> should be supplied/interpreted strictly in terms of 512-byte units, others
> suggest they should be scaled by the value of "sector-size" i.e. the
> logical block size of the underlying backend storage.
> This confusion has caused mixed semantics to become ingrained in frontend
> implementations. For instance Linux xen-blkfront.c contains code such as:
> 
>     fsect = offset >> 9;
>     lsect = fsect + (len >> 9) - 1;
> 
> whereas the Windows XENVBD frontend contains the following equivalent code:
> 
>     Segment->FirstSector = (UCHAR)((Offset + SectorSize - 1) / SectorSize);
>     *SectorsNow = __min(SectorsLeft, SectorsPerPage - Segment->FirstSector);
>     Segment->LastSector = (UCHAR)(Segment->FirstSector + *SectorsNow - 1);
> 
> (where SectorSize is the "sector-size" value advertized in xenstore).
> 
> Thus it has become unsafe for a backend to set "sector-size" to anything
> other than 512 as it does not know which way the frontend is coded.
> 
> This patch is intended to clarify the situation and also introduce a
> mechanism to allow logical block sizes of more than 512 to be supported...
> 
> A new frontend feature node is specified: 'feature-large-sector-size'.
> If this node is present and set to "1" then it means that frontend is
> coded to supply and interpret all sector based quantities in terms of the
> the advertized "sector-size" value rather than a hardcoded size of 512.
> 
> [1] https://lists.xenproject.org/archives/html/xen-devel/2019-03/msg01600.html
> 
> Signed-off-by: Paul Durrant <paul.durrant@citrix.com>

Apart from the already mentioned commit message issues:

Reviewed-by: Juergen Gross <jgross@suse.com>


Juergen
diff mbox series

Patch

diff --git a/xen/include/public/io/blkif.h b/xen/include/public/io/blkif.h
index 15a71e3fea..d4a34de94c 100644
--- a/xen/include/public/io/blkif.h
+++ b/xen/include/public/io/blkif.h
@@ -254,18 +254,26 @@ 
  * sector-size
  *      Values:         <uint32_t>
  *
- *      The logical sector size, in bytes, of the backend device.
+ *      The logical block size, in bytes, of the underlying storage. This
+ *      must be a power of two with a minimum value of 512.
+ *
+ *      NOTE: Because of implementation bugs in some frontends this must be
+ *            set to 512, unless the frontend advertizes a non-zero value
+ *            in its "feature-large-sector-size" xenbus node. (See below).
  *
  * physical-sector-size
  *      Values:         <uint32_t>
+ *      Default Value:  <"sector-size">
  *
- *      The physical sector size, in bytes, of the backend device.
+ *      The physical block size, in bytes, of the backend storage. This
+ *      must be an integer multiple of "sector-size".
  *
  * sectors
  *      Values:         <uint64_t>
  *
- *      The size of the backend device, expressed in units of its logical
- *      sector size ("sector-size").
+ *      The size of the backend device, expressed in units of "sector-size".
+ *      The product of "sector-size" and "sectors" must also be an integer
+ *      multiple of "physical-sector-size", if that node is present.
  *
  *****************************************************************************
  *                            Frontend XenBus Nodes
@@ -321,6 +329,8 @@ 
  *      The size of the frontend allocated request ring buffer in units of
  *      machine pages.  The value must be a power of 2.
  *
+ *--------------------------------- Features ---------------------------------
+ *
  * feature-persistent
  *      Values:         0/1 (boolean)
  *      Default Value:  0
@@ -342,6 +352,17 @@ 
  *      decides to limit the maximum number of persistently mapped grants
  *      to a value less than RING_SIZE * BLKIF_MAX_SEGMENTS_PER_REQUEST.
  *
+ * feature-large-sector-size
+ *      Values:         0/1 (boolean)
+ *      Default Value:  0
+ *
+ *      A value of "1" indicates that the frontend will correctly supply and
+ *      interpret all sector-based quantities in terms of the "sector-size"
+ *      value supplied in the backend info, whatever that may be set to.
+ *      If this node is not present or its value is "0" then it is assumed
+ *      that the frontend requires that the logical block size is 512 as it
+ *      is hardcoded (which is the case in some frontend implementations).
+ *
  *------------------------- Virtual Device Properties -------------------------
  *
  * device-type
@@ -607,12 +628,11 @@ 
 #define BLKIF_MAX_INDIRECT_PAGES_PER_REQUEST 8
 
 /*
- * NB. first_sect and last_sect in blkif_request_segment, as well as
- * sector_number in blkif_request, are always expressed in 512-byte units.
- * However they must be properly aligned to the real sector size of the
- * physical disk, which is reported in the "physical-sector-size" node in
- * the backend xenbus info. Also the xenbus "sectors" node is expressed in
- * 512-byte units.
+ * NB. 'first_sect' and 'last_sect' in blkif_request_segment, as well as
+ * 'sector_number' in blkif_request, blkif_request_discard and
+ * blkif_request_indirect are sector-based quantities. See the description
+ * of the "feature-large-sector-size" frontend xenbus node above for
+ * more information.
  */
 struct blkif_request_segment {
     grant_ref_t gref;        /* reference to I/O buffer frame        */