Message ID | 20191205140123.3817-4-pdurrant@amazon.com (mailing list archive) |
---|---|
State | Superseded |
Series | xen-blkback: support live update |
On Thu, Dec 05, 2019 at 02:01:22PM +0000, Paul Durrant wrote:
> Currently these macros will skip over any requests/responses that are
> added to the shared ring whilst it is detached. This, in general, is not
> a desirable semantic since most frontend implementations will eventually
> block waiting for a response which would either never appear or never be
> processed.
>
> NOTE: These macros are currently unused. BACK_RING_ATTACH(), however, will
> be used in a subsequent patch.
>
> Signed-off-by: Paul Durrant <pdurrant@amazon.com>

Those headers come from Xen, and should be modified in Xen first and
then imported into Linux IMO.

Thanks, Roger.

On 09.12.19 12:41, Roger Pau Monné wrote:
> On Thu, Dec 05, 2019 at 02:01:22PM +0000, Paul Durrant wrote:
>> Currently these macros will skip over any requests/responses that are
>> added to the shared ring whilst it is detached. This, in general, is not
>> a desirable semantic since most frontend implementations will eventually
>> block waiting for a response which would either never appear or never be
>> processed.
>>
>> NOTE: These macros are currently unused. BACK_RING_ATTACH(), however, will
>> be used in a subsequent patch.
>>
>> Signed-off-by: Paul Durrant <pdurrant@amazon.com>
>
> Those headers come from Xen, and should be modified in Xen first and
> then imported into Linux IMO.

In theory, yes. But the Xen variant doesn't contain the ATTACH macros.

Juergen

> -----Original Message-----
> From: Xen-devel <xen-devel-bounces@lists.xenproject.org> On Behalf Of
> Jürgen Groß
> Sent: 09 December 2019 11:52
> To: Roger Pau Monné <roger.pau@citrix.com>; Durrant, Paul
> <pdurrant@amazon.com>
> Cc: xen-devel@lists.xenproject.org; Boris Ostrovsky
> <boris.ostrovsky@oracle.com>; Stefano Stabellini <sstabellini@kernel.org>;
> linux-kernel@vger.kernel.org
> Subject: Re: [Xen-devel] [PATCH 3/4] xen/interface: don't discard pending
> work in FRONT/BACK_RING_ATTACH
>
> On 09.12.19 12:41, Roger Pau Monné wrote:
> > On Thu, Dec 05, 2019 at 02:01:22PM +0000, Paul Durrant wrote:
> >> Currently these macros will skip over any requests/responses that are
> >> added to the shared ring whilst it is detached. This, in general, is not
> >> a desirable semantic since most frontend implementations will eventually
> >> block waiting for a response which would either never appear or never be
> >> processed.
> >>
> >> NOTE: These macros are currently unused. BACK_RING_ATTACH(), however, will
> >> be used in a subsequent patch.
> >>
> >> Signed-off-by: Paul Durrant <pdurrant@amazon.com>
> >
> > Those headers come from Xen, and should be modified in Xen first and
> > then imported into Linux IMO.
>
> In theory, yes. But the Xen variant doesn't contain the ATTACH macros.
>

OOI do we have a policy about this? Re-importing headers into Linux
wholesale is always slightly painful because of interdependencies and
style checking issues.

  Paul

>
> Juergen

On 05.12.19 15:01, Paul Durrant wrote:
> Currently these macros will skip over any requests/responses that are
> added to the shared ring whilst it is detached. This, in general, is not
> a desirable semantic since most frontend implementations will eventually
> block waiting for a response which would either never appear or never be
> processed.
>
> NOTE: These macros are currently unused. BACK_RING_ATTACH(), however, will
> be used in a subsequent patch.
>
> Signed-off-by: Paul Durrant <pdurrant@amazon.com>
> ---
> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
> Cc: Juergen Gross <jgross@suse.com>
> Cc: Stefano Stabellini <sstabellini@kernel.org>
> ---
>  include/xen/interface/io/ring.h | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/include/xen/interface/io/ring.h b/include/xen/interface/io/ring.h
> index 3f40501fc60b..405adfed87e6 100644
> --- a/include/xen/interface/io/ring.h
> +++ b/include/xen/interface/io/ring.h
> @@ -143,14 +143,14 @@ struct __name##_back_ring {                         \
>  #define FRONT_RING_ATTACH(_r, _s, __size) do {                          \
>      (_r)->sring = (_s);                                                 \
>      (_r)->req_prod_pvt = (_s)->req_prod;                                \
> -    (_r)->rsp_cons = (_s)->rsp_prod;                                    \
> +    (_r)->rsp_cons = (_s)->req_prod;                                    \
>      (_r)->nr_ents = __RING_SIZE(_s, __size);                            \
>  } while (0)
>
>  #define BACK_RING_ATTACH(_r, _s, __size) do {                           \
>      (_r)->sring = (_s);                                                 \
>      (_r)->rsp_prod_pvt = (_s)->rsp_prod;                                \
> -    (_r)->req_cons = (_s)->req_prod;                                    \
> +    (_r)->req_cons = (_s)->rsp_prod;                                    \
>      (_r)->nr_ents = __RING_SIZE(_s, __size);                            \
>  } while (0)

Lets look at all possible scenarios where BACK_RING_ATTACH()
might happen:

Initially (after [FRONT|BACK]_RING_INIT(), leaving _pvt away):
req_prod=0, rsp_cons=0, rsp_prod=0, req_cons=0
Using BACK_RING_ATTACH() is fine (no change)

Request queued:
req_prod=1, rsp_cons=0, rsp_prod=0, req_cons=0
Using BACK_RING_ATTACH() is fine (no change)

and taken by backend:
req_prod=1, rsp_cons=0, rsp_prod=0, req_cons=1
Using BACK_RING_ATTACH() is resetting req_cons to 0, will result
in redoing request (for blk this is fine, other devices like SCSI
tapes will have issues with that). One possible solution would be
to ensure all taken requests are either stopped or the response
is queued already.

Response queued:
req_prod=1, rsp_cons=0, rsp_prod=1, req_cons=1
Using BACK_RING_ATTACH() is fine (no change)

Response taken:
req_prod=1, rsp_cons=1, rsp_prod=1, req_cons=1
Using BACK_RING_ATTACH() is fine (no change)

In general I believe the [FRONT|BACK]_RING_ATTACH() macros are not
fine to be used in the current state, as the *_pvt fields normally not
accessible by the other end are initialized using the (possibly
untrusted) values from the shared ring. There needs at least to be a
test for the values to be sane, and your change should not result in the
same value to be read twice, as it could have changed in between.

As this is an error which can happen in other OS's, too, I'd recommend
to add the adapted macros (plus a comment regarding the possible
problem noted above for special devices like tapes) to the Xen variant
of ring.h.

Juergen

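The scenario list in the review above can be checked mechanically. The following is a minimal sketch, not code from the patch: `demo_sring`, `demo_back_ring`, `demo_back_ring_attach()` and `demo_replayed_requests()` are hypothetical, cut-down stand-ins that copy only the index assignments of the patched BACK_RING_ATTACH() and leave out nr_ents and the ring entries themselves.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the shared ring: only the producer indexes. */
struct demo_sring {
    uint32_t req_prod;   /* advanced by the frontend */
    uint32_t rsp_prod;   /* advanced by the backend  */
};

/* Hypothetical stand-in for the backend's private ring state. */
struct demo_back_ring {
    uint32_t rsp_prod_pvt;
    uint32_t req_cons;
    struct demo_sring *sring;
};

/* Same index assignments as the patched BACK_RING_ATTACH(). */
static void demo_back_ring_attach(struct demo_back_ring *r,
                                  struct demo_sring *s)
{
    r->sring = s;
    r->rsp_prod_pvt = s->rsp_prod;
    r->req_cons = s->rsp_prod;
}

/*
 * Given the index values at detach time, return how many requests the
 * old backend instance had already consumed that the new instance will
 * consume again after attaching.
 */
static uint32_t demo_replayed_requests(uint32_t req_prod, uint32_t rsp_prod,
                                       uint32_t old_req_cons)
{
    struct demo_sring s = { .req_prod = req_prod, .rsp_prod = rsp_prod };
    struct demo_back_ring r;

    /* The backend cannot have consumed more than was produced. */
    assert(old_req_cons - rsp_prod <= req_prod - rsp_prod);

    demo_back_ring_attach(&r, &s);
    return old_req_cons - r.req_cons;
}
```

Calling `demo_replayed_requests(1, 0, 1)` models the "taken by backend" case and reports one replayed request, while the other listed states report none. This matches the observation that the macro is only safe when every consumed request has been either stopped or answered before detach.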
> -----Original Message-----
> From: Jürgen Groß <jgross@suse.com>
> Sent: 09 December 2019 13:55
> To: Durrant, Paul <pdurrant@amazon.com>; linux-kernel@vger.kernel.org;
> xen-devel@lists.xenproject.org
> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>; Stefano Stabellini
> <sstabellini@kernel.org>
> Subject: Re: [PATCH 3/4] xen/interface: don't discard pending work in
> FRONT/BACK_RING_ATTACH
>
> On 05.12.19 15:01, Paul Durrant wrote:
> > Currently these macros will skip over any requests/responses that are
> > added to the shared ring whilst it is detached. This, in general, is not
> > a desirable semantic since most frontend implementations will eventually
> > block waiting for a response which would either never appear or never be
> > processed.
> >
> > NOTE: These macros are currently unused. BACK_RING_ATTACH(), however, will
> > be used in a subsequent patch.
> >
> > Signed-off-by: Paul Durrant <pdurrant@amazon.com>
> > ---
> > Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
> > Cc: Juergen Gross <jgross@suse.com>
> > Cc: Stefano Stabellini <sstabellini@kernel.org>
> > ---
> >  include/xen/interface/io/ring.h | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/include/xen/interface/io/ring.h b/include/xen/interface/io/ring.h
> > index 3f40501fc60b..405adfed87e6 100644
> > --- a/include/xen/interface/io/ring.h
> > +++ b/include/xen/interface/io/ring.h
> > @@ -143,14 +143,14 @@ struct __name##_back_ring {                         \
> >  #define FRONT_RING_ATTACH(_r, _s, __size) do {                          \
> >      (_r)->sring = (_s);                                                 \
> >      (_r)->req_prod_pvt = (_s)->req_prod;                                \
> > -    (_r)->rsp_cons = (_s)->rsp_prod;                                    \
> > +    (_r)->rsp_cons = (_s)->req_prod;                                    \
> >      (_r)->nr_ents = __RING_SIZE(_s, __size);                            \
> >  } while (0)
> >
> >  #define BACK_RING_ATTACH(_r, _s, __size) do {                           \
> >      (_r)->sring = (_s);                                                 \
> >      (_r)->rsp_prod_pvt = (_s)->rsp_prod;                                \
> > -    (_r)->req_cons = (_s)->req_prod;                                    \
> > +    (_r)->req_cons = (_s)->rsp_prod;                                    \
> >      (_r)->nr_ents = __RING_SIZE(_s, __size);                            \
> >  } while (0)
>
> Lets look at all possible scenarios where BACK_RING_ATTACH()
> might happen:
>
> Initially (after [FRONT|BACK]_RING_INIT(), leaving _pvt away):
> req_prod=0, rsp_cons=0, rsp_prod=0, req_cons=0
> Using BACK_RING_ATTACH() is fine (no change)
>
> Request queued:
> req_prod=1, rsp_cons=0, rsp_prod=0, req_cons=0
> Using BACK_RING_ATTACH() is fine (no change)
>
> and taken by backend:
> req_prod=1, rsp_cons=0, rsp_prod=0, req_cons=1
> Using BACK_RING_ATTACH() is resetting req_cons to 0, will result
> in redoing request (for blk this is fine, other devices like SCSI
> tapes will have issues with that). One possible solution would be
> to ensure all taken requests are either stopped or the response
> is queued already.

Yes, it is the assumption that a backend will drain and complete any
requests it is handling, but it will not deal with new ones being posted
by the frontend. This does appear to be the case for blkback.

>
> Response queued:
> req_prod=1, rsp_cons=0, rsp_prod=1, req_cons=1
> Using BACK_RING_ATTACH() is fine (no change)
>
> Response taken:
> req_prod=1, rsp_cons=1, rsp_prod=1, req_cons=1
> Using BACK_RING_ATTACH() is fine (no change)
>
> In general I believe the [FRONT|BACK]_RING_ATTACH() macros are not
> fine to be used in the current state, as the *_pvt fields normally not
> accessible by the other end are initialized using the (possibly
> untrusted) values from the shared ring. There needs at least to be a
> test for the values to be sane, and your change should not result in the
> same value to be read twice, as it could have changed in between.

What test would you apply to sanitize the value of the pvt pointer?

Another option would be to have a backend write its pvt value into the
xenstore backend area when the ring is unmapped, so that a new instance
definitely resumes where the old one left off. The value of rsp_prod
could, of course, be overwritten by the guest at any time and so there's
little point in attempting sanitize it.

>
> As this is an error which can happen in other OS's, too, I'd recommend
> to add the adapted macros (plus a comment regarding the possible
> problem noted above for special devices like tapes) to the Xen variant
> of ring.h.
>

I can certainly send a patch to Xen once we agree on the final definition.

  Paul

>
> Juergen

On 09.12.19 17:38, Durrant, Paul wrote:
>> -----Original Message-----
>> From: Jürgen Groß <jgross@suse.com>
>> Sent: 09 December 2019 13:55
>> To: Durrant, Paul <pdurrant@amazon.com>; linux-kernel@vger.kernel.org;
>> xen-devel@lists.xenproject.org
>> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>; Stefano Stabellini
>> <sstabellini@kernel.org>
>> Subject: Re: [PATCH 3/4] xen/interface: don't discard pending work in
>> FRONT/BACK_RING_ATTACH
>>
>> On 05.12.19 15:01, Paul Durrant wrote:
>>> Currently these macros will skip over any requests/responses that are
>>> added to the shared ring whilst it is detached. This, in general, is not
>>> a desirable semantic since most frontend implementations will eventually
>>> block waiting for a response which would either never appear or never be
>>> processed.
>>>
>>> NOTE: These macros are currently unused. BACK_RING_ATTACH(), however, will
>>> be used in a subsequent patch.
>>>
>>> Signed-off-by: Paul Durrant <pdurrant@amazon.com>
>>> ---
>>> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
>>> Cc: Juergen Gross <jgross@suse.com>
>>> Cc: Stefano Stabellini <sstabellini@kernel.org>
>>> ---
>>>  include/xen/interface/io/ring.h | 4 ++--
>>>  1 file changed, 2 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/include/xen/interface/io/ring.h b/include/xen/interface/io/ring.h
>>> index 3f40501fc60b..405adfed87e6 100644
>>> --- a/include/xen/interface/io/ring.h
>>> +++ b/include/xen/interface/io/ring.h
>>> @@ -143,14 +143,14 @@ struct __name##_back_ring {                         \
>>>  #define FRONT_RING_ATTACH(_r, _s, __size) do {                          \
>>>      (_r)->sring = (_s);                                                 \
>>>      (_r)->req_prod_pvt = (_s)->req_prod;                                \
>>> -    (_r)->rsp_cons = (_s)->rsp_prod;                                    \
>>> +    (_r)->rsp_cons = (_s)->req_prod;                                    \
>>>      (_r)->nr_ents = __RING_SIZE(_s, __size);                            \
>>>  } while (0)
>>>
>>>  #define BACK_RING_ATTACH(_r, _s, __size) do {                           \
>>>      (_r)->sring = (_s);                                                 \
>>>      (_r)->rsp_prod_pvt = (_s)->rsp_prod;                                \
>>> -    (_r)->req_cons = (_s)->req_prod;                                    \
>>> +    (_r)->req_cons = (_s)->rsp_prod;                                    \
>>>      (_r)->nr_ents = __RING_SIZE(_s, __size);                            \
>>>  } while (0)
>>
>> Lets look at all possible scenarios where BACK_RING_ATTACH()
>> might happen:
>>
>> Initially (after [FRONT|BACK]_RING_INIT(), leaving _pvt away):
>> req_prod=0, rsp_cons=0, rsp_prod=0, req_cons=0
>> Using BACK_RING_ATTACH() is fine (no change)
>>
>> Request queued:
>> req_prod=1, rsp_cons=0, rsp_prod=0, req_cons=0
>> Using BACK_RING_ATTACH() is fine (no change)
>>
>> and taken by backend:
>> req_prod=1, rsp_cons=0, rsp_prod=0, req_cons=1
>> Using BACK_RING_ATTACH() is resetting req_cons to 0, will result
>> in redoing request (for blk this is fine, other devices like SCSI
>> tapes will have issues with that). One possible solution would be
>> to ensure all taken requests are either stopped or the response
>> is queued already.
>
> Yes, it is the assumption that a backend will drain and complete any
> requests it is handling, but it will not deal with new ones being posted
> by the frontend. This does appear to be the case for blkback.
>
>>
>> Response queued:
>> req_prod=1, rsp_cons=0, rsp_prod=1, req_cons=1
>> Using BACK_RING_ATTACH() is fine (no change)
>>
>> Response taken:
>> req_prod=1, rsp_cons=1, rsp_prod=1, req_cons=1
>> Using BACK_RING_ATTACH() is fine (no change)
>>
>> In general I believe the [FRONT|BACK]_RING_ATTACH() macros are not
>> fine to be used in the current state, as the *_pvt fields normally not
>> accessible by the other end are initialized using the (possibly
>> untrusted) values from the shared ring. There needs at least to be a
>> test for the values to be sane, and your change should not result in the
>> same value to be read twice, as it could have changed in between.
>
> What test would you apply to sanitize the value of the pvt pointer?

For the BACK_RING_ATTACH() case rsp_prod_pvt should not be between
req_prod and req_cons, and req_cons - rsp_prod_pvt should be <= ring
size IMO.

> Another option would be to have a backend write its pvt value into the
> xenstore backend area when the ring is unmapped, so that a new instance
> definitely resumes where the old one left off. The value of rsp_prod
> could, of course, be overwritten by the guest at any time and so there's
> little point in attempting sanitize it.

I don't think this would be necessary. With above validation in place
all the guest could do would be to shoot itself in the foot.

Juergen

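One way the two requests from this exchange (read each shared value only once, and reject insane values) might be combined is sketched below. This is an interpretation, not the definition the thread agreed on: because the patched macro sets both req_cons and rsp_prod_pvt from rsp_prod, the sketch assumes the only remaining check is that the number of unanswered requests, req_prod - rsp_prod, does not exceed the ring size. The names `demo_sring`, `demo_back_ring` and `demo_back_ring_attach_checked()` are hypothetical, and the volatile casts stand in for what kernel code would normally express with READ_ONCE().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the shared ring: only the producer indexes. */
struct demo_sring {
    uint32_t req_prod;
    uint32_t rsp_prod;
};

/* Hypothetical stand-in for the backend's private ring state. */
struct demo_back_ring {
    uint32_t rsp_prod_pvt;
    uint32_t req_cons;
    uint32_t nr_ents;
    struct demo_sring *sring;
};

/*
 * Attach to an existing shared ring, returning false (and leaving *r
 * untouched) if the shared indexes look insane.
 */
static bool demo_back_ring_attach_checked(struct demo_back_ring *r,
                                          struct demo_sring *s,
                                          uint32_t nr_ents)
{
    /* Ring sizes are powers of two. */
    assert(nr_ents != 0 && (nr_ents & (nr_ents - 1)) == 0);

    /*
     * Snapshot each shared index exactly once: the other end can rewrite
     * the shared page at any time, so a second read may differ.
     */
    uint32_t rsp_prod = *(volatile uint32_t *)&s->rsp_prod;
    uint32_t req_prod = *(volatile uint32_t *)&s->req_prod;

    /* Unsigned subtraction stays correct across index wrap-around. */
    if (req_prod - rsp_prod > nr_ents)
        return false;

    r->sring = s;
    r->rsp_prod_pvt = rsp_prod;
    r->req_cons = rsp_prod;
    r->nr_ents = nr_ents;
    return true;
}
```

With a 32-entry ring, indexes req_prod=5 and rsp_prod=3 are accepted, whereas req_prod=100 and rsp_prod=3 are rejected because they would imply 97 outstanding requests.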
diff --git a/include/xen/interface/io/ring.h b/include/xen/interface/io/ring.h
index 3f40501fc60b..405adfed87e6 100644
--- a/include/xen/interface/io/ring.h
+++ b/include/xen/interface/io/ring.h
@@ -143,14 +143,14 @@ struct __name##_back_ring {                         \
 #define FRONT_RING_ATTACH(_r, _s, __size) do {                          \
     (_r)->sring = (_s);                                                 \
     (_r)->req_prod_pvt = (_s)->req_prod;                                \
-    (_r)->rsp_cons = (_s)->rsp_prod;                                    \
+    (_r)->rsp_cons = (_s)->req_prod;                                    \
     (_r)->nr_ents = __RING_SIZE(_s, __size);                            \
 } while (0)

 #define BACK_RING_ATTACH(_r, _s, __size) do {                           \
     (_r)->sring = (_s);                                                 \
     (_r)->rsp_prod_pvt = (_s)->rsp_prod;                                \
-    (_r)->req_cons = (_s)->req_prod;                                    \
+    (_r)->req_cons = (_s)->rsp_prod;                                    \
     (_r)->nr_ents = __RING_SIZE(_s, __size);                            \
 } while (0)

Currently these macros will skip over any requests/responses that are
added to the shared ring whilst it is detached. This, in general, is not
a desirable semantic since most frontend implementations will eventually
block waiting for a response which would either never appear or never be
processed.

NOTE: These macros are currently unused. BACK_RING_ATTACH(), however, will
be used in a subsequent patch.

Signed-off-by: Paul Durrant <pdurrant@amazon.com>
---
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
---
 include/xen/interface/io/ring.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
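
The behaviour described in the commit message can be illustrated for the backend side with a small model. This is a sketch under stated assumptions rather than kernel code: `demo_sring`, `demo_back_ring`, `demo_attach_old()`, `demo_attach_new()` and `demo_pending_requests()` are hypothetical names, and the two attach helpers reproduce only the req_cons and rsp_prod_pvt assignments from the removed and added lines of the BACK_RING_ATTACH() hunk.

```c
#include <stdint.h>

/* Hypothetical stand-in for the shared ring: only the producer indexes. */
struct demo_sring {
    uint32_t req_prod;
    uint32_t rsp_prod;
};

/* Hypothetical stand-in for the backend's private indexes. */
struct demo_back_ring {
    uint32_t rsp_prod_pvt;
    uint32_t req_cons;
};

/* Index assignments of BACK_RING_ATTACH() before the patch. */
static void demo_attach_old(struct demo_back_ring *r,
                            const struct demo_sring *s)
{
    r->rsp_prod_pvt = s->rsp_prod;
    r->req_cons = s->req_prod;   /* jumps past anything still queued */
}

/* Index assignments of BACK_RING_ATTACH() after the patch. */
static void demo_attach_new(struct demo_back_ring *r,
                            const struct demo_sring *s)
{
    r->rsp_prod_pvt = s->rsp_prod;
    r->req_cons = s->rsp_prod;   /* resumes after the last response */
}

/* Requests the backend still considers unconsumed. */
static uint32_t demo_pending_requests(const struct demo_back_ring *r,
                                      const struct demo_sring *s)
{
    return s->req_prod - r->req_cons;
}
```

Take a ring where the old backend answered one request (rsp_prod=1) and the frontend then queued two more while the ring was detached (req_prod=3). After `demo_attach_old()` the backend sees no pending requests, so the frontend would wait forever. After `demo_attach_new()` it sees both.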