Message ID | 20200728113712.22966-5-andrew.cooper3@citrix.com (mailing list archive) |
---|---|
State | Superseded |
Series | Multiple fixes to XENMEM_acquire_resource |
> -----Original Message----- > From: Xen-devel <xen-devel-bounces@lists.xenproject.org> On Behalf Of Andrew Cooper > Sent: 28 July 2020 12:37 > To: Xen-devel <xen-devel@lists.xenproject.org> > Cc: Hubert Jasudowicz <hubert.jasudowicz@cert.pl>; Stefano Stabellini <sstabellini@kernel.org>; Julien > Grall <julien@xen.org>; Wei Liu <wl@xen.org>; Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>; George > Dunlap <George.Dunlap@eu.citrix.com>; Andrew Cooper <andrew.cooper3@citrix.com>; Paul Durrant > <paul@xen.org>; Jan Beulich <JBeulich@suse.com>; Michał Leszczyński <michal.leszczynski@cert.pl>; Ian > Jackson <ian.jackson@citrix.com> > Subject: [PATCH 4/5] xen/memory: Fix acquire_resource size semantics > > Calling XENMEM_acquire_resource with a NULL frame_list is a request for the > size of the resource, but the returned 32 is bogus. > > If someone tries to follow it for XENMEM_resource_ioreq_server, the acquire > call will fail as IOREQ servers currently top out at 2 frames, and it is only > half the size of the default grant table limit for guests. > > Also, no users actually request a resource size, because it was never wired up > in the sole implemenation of resource aquisition in Linux. > > Introduce a new resource_max_frames() to calculate the size of a resource, and > implement it the IOREQ and grant subsystems. > > It is impossible to guarentee that a mapping call following a successful size s/guarantee/guarantee > call will succedd (e.g. The target IOREQ server gets destroyed, or the domain s/succedd/succeed > switches from grant v2 to v1). Document the restriction, and use the > flexibility to simplify the paths to be lockless. > > Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> > --- > CC: George Dunlap <George.Dunlap@eu.citrix.com> > CC: Ian Jackson <ian.jackson@citrix.com> > CC: Jan Beulich <JBeulich@suse.com> > CC: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> > CC: Stefano Stabellini <sstabellini@kernel.org> > CC: Wei Liu <wl@xen.org> > CC: Julien Grall <julien@xen.org> > CC: Paul Durrant <paul@xen.org> > CC: Michał Leszczyński <michal.leszczynski@cert.pl> > CC: Hubert Jasudowicz <hubert.jasudowicz@cert.pl> > --- > xen/arch/x86/mm.c | 20 ++++++++++++++++ > xen/common/grant_table.c | 19 +++++++++++++++ > xen/common/memory.c | 55 +++++++++++++++++++++++++++++++++---------- > xen/include/asm-x86/mm.h | 3 +++ > xen/include/public/memory.h | 16 +++++++++---- > xen/include/xen/grant_table.h | 8 +++++++ > xen/include/xen/mm.h | 6 +++++ > 7 files changed, 110 insertions(+), 17 deletions(-) > > diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c > index 82bc676553..f73a90a2ab 100644 > --- a/xen/arch/x86/mm.c > +++ b/xen/arch/x86/mm.c > @@ -4600,6 +4600,26 @@ int xenmem_add_to_physmap_one( > return rc; > } > > +unsigned int arch_resource_max_frames( > + struct domain *d, unsigned int type, unsigned int id) > +{ > + unsigned int nr = 0; > + > + switch ( type ) > + { > +#ifdef CONFIG_HVM > + case XENMEM_resource_ioreq_server: > + if ( !is_hvm_domain(d) ) > + break; > + /* One frame for the buf-ioreq ring, and one frame per 128 vcpus. 
*/ > + nr = 1 + DIV_ROUND_UP(d->max_vcpus * sizeof(struct ioreq), PAGE_SIZE); > + break; > +#endif > + } > + > + return nr; > +} > + > int arch_acquire_resource(struct domain *d, unsigned int type, > unsigned int id, unsigned long frame, > unsigned int nr_frames, xen_pfn_t mfn_list[]) > diff --git a/xen/common/grant_table.c b/xen/common/grant_table.c > index 122d1e7596..0962fc7169 100644 > --- a/xen/common/grant_table.c > +++ b/xen/common/grant_table.c > @@ -4013,6 +4013,25 @@ static int gnttab_get_shared_frame_mfn(struct domain *d, > return 0; > } > > +unsigned int gnttab_resource_max_frames(struct domain *d, unsigned int id) > +{ > + unsigned int nr = 0; > + > + /* Don't need the grant lock. This limit is fixed at domain create time. */ > + switch ( id ) > + { > + case XENMEM_resource_grant_table_id_shared: > + nr = d->grant_table->max_grant_frames; > + break; > + > + case XENMEM_resource_grant_table_id_status: > + nr = grant_to_status_frames(d->grant_table->max_grant_frames); Two uses of d->grant_table, so perhaps define a stack variable for it? Also, should you not make sure 0 is returned in the case of a v1 table? > + break; > + } > + > + return nr; > +} > + > int gnttab_acquire_resource( > struct domain *d, unsigned int id, unsigned long frame, > unsigned int nr_frames, xen_pfn_t mfn_list[]) > diff --git a/xen/common/memory.c b/xen/common/memory.c > index dc3a7248e3..21edabf9cc 100644 > --- a/xen/common/memory.c > +++ b/xen/common/memory.c > @@ -1007,6 +1007,26 @@ static long xatp_permission_check(struct domain *d, unsigned int space) > return xsm_add_to_physmap(XSM_TARGET, current->domain, d); > } > > +/* > + * Return 0 on any kind of error. Caller converts to -EINVAL. > + * > + * All nonzero values should be repeatable (i.e. derived from some fixed > + * proerty of the domain), and describe the full resource (i.e. mapping the s/property/property > + * result of this call will be the entire resource). This precludes dynamically adding a resource to a running domain. Do we really want to bake in that restriction? > + */ > +static unsigned int resource_max_frames(struct domain *d, > + unsigned int type, unsigned int id) > +{ > + switch ( type ) > + { > + case XENMEM_resource_grant_table: > + return gnttab_resource_max_frames(d, id); > + > + default: > + return arch_resource_max_frames(d, type, id); > + } > +} > + > static int acquire_resource( > XEN_GUEST_HANDLE_PARAM(xen_mem_acquire_resource_t) arg) > { > @@ -1018,6 +1038,7 @@ static int acquire_resource( > * use-cases then per-CPU arrays or heap allocations may be required. 
> */ > xen_pfn_t mfn_list[32]; > + unsigned int max_frames; > int rc; > > if ( copy_from_guest(&xmar, arg, 1) ) > @@ -1026,19 +1047,6 @@ static int acquire_resource( > if ( xmar.pad != 0 ) > return -EINVAL; > > - if ( guest_handle_is_null(xmar.frame_list) ) > - { > - if ( xmar.nr_frames ) > - return -EINVAL; > - > - xmar.nr_frames = ARRAY_SIZE(mfn_list); > - > - if ( __copy_field_to_guest(arg, &xmar, nr_frames) ) > - return -EFAULT; > - > - return 0; > - } > - > if ( xmar.nr_frames > ARRAY_SIZE(mfn_list) ) > return -E2BIG; > > @@ -1050,6 +1058,27 @@ static int acquire_resource( > if ( rc ) > goto out; > > + max_frames = resource_max_frames(d, xmar.type, xmar.id); > + > + rc = -EINVAL; > + if ( !max_frames ) > + goto out; > + > + if ( guest_handle_is_null(xmar.frame_list) ) > + { > + if ( xmar.nr_frames ) > + goto out; > + > + xmar.nr_frames = max_frames; > + > + rc = -EFAULT; > + if ( __copy_field_to_guest(arg, &xmar, nr_frames) ) > + goto out; > + > + rc = 0; > + goto out; > + } > + > switch ( xmar.type ) > { > case XENMEM_resource_grant_table: > diff --git a/xen/include/asm-x86/mm.h b/xen/include/asm-x86/mm.h > index 7e74996053..b0caf372a8 100644 > --- a/xen/include/asm-x86/mm.h > +++ b/xen/include/asm-x86/mm.h > @@ -649,6 +649,9 @@ static inline bool arch_mfn_in_directmap(unsigned long mfn) > return mfn <= (virt_to_mfn(eva - 1) + 1); > } > > +unsigned int arch_resource_max_frames(struct domain *d, > + unsigned int type, unsigned int id); > + > int arch_acquire_resource(struct domain *d, unsigned int type, > unsigned int id, unsigned long frame, > unsigned int nr_frames, xen_pfn_t mfn_list[]); > diff --git a/xen/include/public/memory.h b/xen/include/public/memory.h > index 21057ed78e..cea88cf40c 100644 > --- a/xen/include/public/memory.h > +++ b/xen/include/public/memory.h > @@ -639,10 +639,18 @@ struct xen_mem_acquire_resource { > #define XENMEM_resource_grant_table_id_status 1 > > /* > - * IN/OUT - As an IN parameter number of frames of the resource > - * to be mapped. However, if the specified value is 0 and > - * frame_list is NULL then this field will be set to the > - * maximum value supported by the implementation on return. > + * IN/OUT > + * > + * As an IN parameter number of frames of the resource to be mapped. > + * > + * When frame_list is NULL and nr_frames is 0, this is interpreted as a > + * request for the size of the resource, which shall be returned in the > + * nr_frames field. > + * > + * The size of a resource will never be zero, but a nonzero result doesn't > + * guarentee that a subsequent mapping request will be successful. There s/guarantee/guarantee Paul > + * are further type/id specific constraints which may change between the > + * two calls. 
> */ > uint32_t nr_frames; > uint32_t pad; > diff --git a/xen/include/xen/grant_table.h b/xen/include/xen/grant_table.h > index 5a2c75b880..bae4d79623 100644 > --- a/xen/include/xen/grant_table.h > +++ b/xen/include/xen/grant_table.h > @@ -57,6 +57,8 @@ int mem_sharing_gref_to_gfn(struct grant_table *gt, grant_ref_t ref, > int gnttab_map_frame(struct domain *d, unsigned long idx, gfn_t gfn, > mfn_t *mfn); > > +unsigned int gnttab_resource_max_frames(struct domain *d, unsigned int id); > + > int gnttab_acquire_resource( > struct domain *d, unsigned int id, unsigned long frame, > unsigned int nr_frames, xen_pfn_t mfn_list[]); > @@ -93,6 +95,12 @@ static inline int gnttab_map_frame(struct domain *d, unsigned long idx, > return -EINVAL; > } > > +static inline unsigned int gnttab_resource_max_frames( > + struct domain *d, unsigned int id) > +{ > + return 0; > +} > + > static inline int gnttab_acquire_resource( > struct domain *d, unsigned int id, unsigned long frame, > unsigned int nr_frames, xen_pfn_t mfn_list[]) > diff --git a/xen/include/xen/mm.h b/xen/include/xen/mm.h > index 1b2c1f6b32..c184dc1db1 100644 > --- a/xen/include/xen/mm.h > +++ b/xen/include/xen/mm.h > @@ -686,6 +686,12 @@ static inline void put_page_alloc_ref(struct page_info *page) > } > > #ifndef CONFIG_ARCH_ACQUIRE_RESOURCE > +static inline unsigned int arch_resource_max_frames( > + struct domain *d, unsigned int type, unsigned int id) > +{ > + return 0; > +} > + > static inline int arch_acquire_resource( > struct domain *d, unsigned int type, unsigned int id, unsigned long frame, > unsigned int nr_frames, xen_pfn_t mfn_list[]) > -- > 2.11.0 >
Hi Paul,

On 30/07/2020 09:31, Paul Durrant wrote:
>> diff --git a/xen/common/memory.c b/xen/common/memory.c
>> index dc3a7248e3..21edabf9cc 100644
>> --- a/xen/common/memory.c
>> +++ b/xen/common/memory.c
>> @@ -1007,6 +1007,26 @@ static long xatp_permission_check(struct domain *d, unsigned int space)
>>      return xsm_add_to_physmap(XSM_TARGET, current->domain, d);
>>  }
>>
>> +/*
>> + * Return 0 on any kind of error.  Caller converts to -EINVAL.
>> + *
>> + * All nonzero values should be repeatable (i.e. derived from some fixed
>> + * proerty of the domain), and describe the full resource (i.e. mapping the
>
> s/proerty/property
>
>> + * result of this call will be the entire resource).
>
> This precludes dynamically adding a resource to a running domain. Do we
> really want to bake in that restriction?

AFAICT, this restriction is not documented in the ABI. In particular, it is
written:

"
The size of a resource will never be zero, but a nonzero result doesn't
guarentee that a subsequent mapping request will be successful.  There
are further type/id specific constraints which may change between the
two calls.
"

So I think a domain couldn't rely on this behavior. Although, it might be
good to clarify in the comment on top of resource_max_frames that this is
an implementation decision and not part of the ABI.

Cheers,
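One possible wording for the clarification suggested here, an extra note on top of resource_max_frames() that is explicitly not part of the posted patch, could be:

```c
/*
 * Return 0 on any kind of error.  Caller converts to -EINVAL.
 *
 * All nonzero values should be repeatable (i.e. derived from some fixed
 * property of the domain), and describe the full resource (i.e. mapping
 * the result of this call will be the entire resource).
 *
 * Note: the "full resource" behaviour is an implementation choice of this
 * hypervisor version, not an ABI guarantee.  The public header only
 * promises that a nonzero size may still fail to map.
 */
```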
On 30/07/2020 09:31, Paul Durrant wrote: >> -----Original Message----- >> From: Xen-devel <xen-devel-bounces@lists.xenproject.org> On Behalf Of Andrew Cooper >> Sent: 28 July 2020 12:37 >> To: Xen-devel <xen-devel@lists.xenproject.org> >> Cc: Hubert Jasudowicz <hubert.jasudowicz@cert.pl>; Stefano Stabellini <sstabellini@kernel.org>; Julien >> Grall <julien@xen.org>; Wei Liu <wl@xen.org>; Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>; George >> Dunlap <George.Dunlap@eu.citrix.com>; Andrew Cooper <andrew.cooper3@citrix.com>; Paul Durrant >> <paul@xen.org>; Jan Beulich <JBeulich@suse.com>; Michał Leszczyński <michal.leszczynski@cert.pl>; Ian >> Jackson <ian.jackson@citrix.com> >> Subject: [PATCH 4/5] xen/memory: Fix acquire_resource size semantics >> >> Calling XENMEM_acquire_resource with a NULL frame_list is a request for the >> size of the resource, but the returned 32 is bogus. >> >> If someone tries to follow it for XENMEM_resource_ioreq_server, the acquire >> call will fail as IOREQ servers currently top out at 2 frames, and it is only >> half the size of the default grant table limit for guests. >> >> Also, no users actually request a resource size, because it was never wired up >> in the sole implemenation of resource aquisition in Linux. >> >> Introduce a new resource_max_frames() to calculate the size of a resource, and >> implement it the IOREQ and grant subsystems. >> >> It is impossible to guarentee that a mapping call following a successful size > s/guarantee/guarantee > >> call will succedd (e.g. The target IOREQ server gets destroyed, or the domain > s/succedd/succeed > >> switches from grant v2 to v1). Document the restriction, and use the >> flexibility to simplify the paths to be lockless. >> >> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> >> --- >> CC: George Dunlap <George.Dunlap@eu.citrix.com> >> CC: Ian Jackson <ian.jackson@citrix.com> >> CC: Jan Beulich <JBeulich@suse.com> >> CC: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> >> CC: Stefano Stabellini <sstabellini@kernel.org> >> CC: Wei Liu <wl@xen.org> >> CC: Julien Grall <julien@xen.org> >> CC: Paul Durrant <paul@xen.org> >> CC: Michał Leszczyński <michal.leszczynski@cert.pl> >> CC: Hubert Jasudowicz <hubert.jasudowicz@cert.pl> >> --- >> xen/arch/x86/mm.c | 20 ++++++++++++++++ >> xen/common/grant_table.c | 19 +++++++++++++++ >> xen/common/memory.c | 55 +++++++++++++++++++++++++++++++++---------- >> xen/include/asm-x86/mm.h | 3 +++ >> xen/include/public/memory.h | 16 +++++++++---- >> xen/include/xen/grant_table.h | 8 +++++++ >> xen/include/xen/mm.h | 6 +++++ >> 7 files changed, 110 insertions(+), 17 deletions(-) >> >> diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c >> index 82bc676553..f73a90a2ab 100644 >> --- a/xen/arch/x86/mm.c >> +++ b/xen/arch/x86/mm.c >> @@ -4600,6 +4600,26 @@ int xenmem_add_to_physmap_one( >> return rc; >> } >> >> +unsigned int arch_resource_max_frames( >> + struct domain *d, unsigned int type, unsigned int id) >> +{ >> + unsigned int nr = 0; >> + >> + switch ( type ) >> + { >> +#ifdef CONFIG_HVM >> + case XENMEM_resource_ioreq_server: >> + if ( !is_hvm_domain(d) ) >> + break; >> + /* One frame for the buf-ioreq ring, and one frame per 128 vcpus. 
*/ >> + nr = 1 + DIV_ROUND_UP(d->max_vcpus * sizeof(struct ioreq), PAGE_SIZE); >> + break; >> +#endif >> + } >> + >> + return nr; >> +} >> + >> int arch_acquire_resource(struct domain *d, unsigned int type, >> unsigned int id, unsigned long frame, >> unsigned int nr_frames, xen_pfn_t mfn_list[]) >> diff --git a/xen/common/grant_table.c b/xen/common/grant_table.c >> index 122d1e7596..0962fc7169 100644 >> --- a/xen/common/grant_table.c >> +++ b/xen/common/grant_table.c >> @@ -4013,6 +4013,25 @@ static int gnttab_get_shared_frame_mfn(struct domain *d, >> return 0; >> } >> >> +unsigned int gnttab_resource_max_frames(struct domain *d, unsigned int id) >> +{ >> + unsigned int nr = 0; >> + >> + /* Don't need the grant lock. This limit is fixed at domain create time. */ >> + switch ( id ) >> + { >> + case XENMEM_resource_grant_table_id_shared: >> + nr = d->grant_table->max_grant_frames; >> + break; >> + >> + case XENMEM_resource_grant_table_id_status: >> + nr = grant_to_status_frames(d->grant_table->max_grant_frames); > Two uses of d->grant_table, so perhaps define a stack variable for it? Can do. > Also, should you not make sure 0 is returned in the case of a v1 table? This was the case specifically discussed in the commit message, but perhaps it needs expanding. Doing so would be buggy. Some utility is going to query the resource size, and then try to map it (if it doesn't blindly know the size and/or subset it cares about already). In between these two hypercalls from the utility, the guest can do a v1=>v2 or v2=>v1 switch and make the resource spontaneously appear or disappear. The only case where we can know for certain whether the resource is available is when we're in the map hypercall. Therefore, userspace has to be able to get to the map call if there is potentially a resource available. The semantics of the size call are really "this resource might exist, and if it does, this is how large it is". As for the grant status frames specifically, I think making them a mappable resource might have been a poor choice in hind sight. Only the guest can switch between grant versions. GNTTABOP_set_version strictly operates on current, unlike most of the other grant hypercalls which take a domid and let dom0 specify something other than DOMID_SELF. There is GNTTABOP_get_version, but it is racy to use in the same way as described above, and if some utility does successfully map the status frames, what will happen in practice is that a guest attempting to switch from v2 back to v1 will have the set_version hypercall fail due to outstanding refs on the frames. ~Andrew
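The calling discipline described above can be condensed into a short caller-side sketch. acquire_resource_size() and acquire_resource_map() are hypothetical wrappers around the two uses of XENMEM_acquire_resource; the point is only that the size answer is advisory while the map call is the authoritative check:

```c
/* Hypothetical wrappers around XENMEM_acquire_resource. */
int acquire_resource_size(unsigned int domid, unsigned int type,
                          unsigned int id, unsigned int *nr_frames);
int acquire_resource_map(unsigned int domid, unsigned int type,
                         unsigned int id, unsigned long frame,
                         unsigned int nr_frames);

static int map_whole_resource(unsigned int domid, unsigned int type,
                              unsigned int id)
{
    unsigned int nr;
    int rc = acquire_resource_size(domid, type, id, &nr);

    if ( rc )
        return rc; /* Resource doesn't currently exist, or bad type/id. */

    /*
     * The size answer only means "if this resource exists, it is this
     * large".  The guest may switch grant table versions, or the IOREQ
     * server may be destroyed, between the two hypercalls, so a failure
     * here is a normal runtime outcome rather than a caller bug.
     */
    return acquire_resource_map(domid, type, id, 0 /* frame */, nr);
}
```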
On 30/07/2020 13:54, Julien Grall wrote:
> Hi Paul,
>
> On 30/07/2020 09:31, Paul Durrant wrote:
>>> diff --git a/xen/common/memory.c b/xen/common/memory.c
>>> index dc3a7248e3..21edabf9cc 100644
>>> --- a/xen/common/memory.c
>>> +++ b/xen/common/memory.c
>>> @@ -1007,6 +1007,26 @@ static long xatp_permission_check(struct domain *d, unsigned int space)
>>>      return xsm_add_to_physmap(XSM_TARGET, current->domain, d);
>>>  }
>>>
>>> +/*
>>> + * Return 0 on any kind of error.  Caller converts to -EINVAL.
>>> + *
>>> + * All nonzero values should be repeatable (i.e. derived from some fixed
>>> + * proerty of the domain), and describe the full resource (i.e. mapping the
>>
>> s/proerty/property
>>
>>> + * result of this call will be the entire resource).
>>
>> This precludes dynamically adding a resource to a running domain. Do
>> we really want to bake in that restriction?
>
> AFAICT, this restriction is not documented in the ABI. In particular,
> it is written:
>
> "
> The size of a resource will never be zero, but a nonzero result doesn't
> guarentee that a subsequent mapping request will be successful.  There
> are further type/id specific constraints which may change between the
> two calls.
> "
>
> So I think a domain couldn't rely on this behavior. Although, it might
> be good to clarify in the comment on top of resource_max_frames that
> this is an implementation decision and not part of the ABI.

There are two aspects here.

First, yes - I deliberately didn't state it in the ABI, just in case we
might want to use it in the future.  I could theoretically foresee using
-EBUSY for the purpose.

That said however, we are currently deliberately taking dynamic resources
out of Xen, because they've proved to be unnecessary in practice and a
fertile source of complexity and security bugs.

I don't foresee accepting new dynamic resources, but that's not to say
that someone can't theoretically come up with a sufficiently compelling
counterexample.

~Andrew
Hi Andrew,

On 30/07/2020 20:46, Andrew Cooper wrote:
> On 30/07/2020 09:31, Paul Durrant wrote:
> In between these two hypercalls from the utility, the guest can do a
> v1=>v2 or v2=>v1 switch and make the resource spontaneously appear or
> disappear.

This can only happen on platforms where grant-table v2 is enabled. Where it
is not enabled (e.g. Arm), I think we want to return 0 as there is nothing
to map.

Cheers,
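A sketch of the variant being suggested, also folding in Paul's stack-variable remark. grant_v2_possible() is a hypothetical predicate standing in for whatever test ends up expressing "this domain can never enable grant table v2" (for instance, v2 support compiled out on Arm):

```c
unsigned int gnttab_resource_max_frames(struct domain *d, unsigned int id)
{
    const struct grant_table *gt = d->grant_table;
    unsigned int nr = 0;

    /* Don't need the grant lock.  This limit is fixed at domain create time. */
    switch ( id )
    {
    case XENMEM_resource_grant_table_id_shared:
        nr = gt->max_grant_frames;
        break;

    case XENMEM_resource_grant_table_id_status:
        /* Hypothetical check: no status frames can ever exist without v2. */
        if ( !grant_v2_possible(d) )
            break;
        nr = grant_to_status_frames(gt->max_grant_frames);
        break;
    }

    return nr;
}
```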
On 30.07.2020 21:46, Andrew Cooper wrote:
> On 30/07/2020 09:31, Paul Durrant wrote:
>>> From: Xen-devel <xen-devel-bounces@lists.xenproject.org> On Behalf Of Andrew Cooper
>>> Sent: 28 July 2020 12:37
>>>
>>> --- a/xen/common/grant_table.c
>>> +++ b/xen/common/grant_table.c
>>> @@ -4013,6 +4013,25 @@ static int gnttab_get_shared_frame_mfn(struct domain *d,
>>>      return 0;
>>>  }
>>>
>>> +unsigned int gnttab_resource_max_frames(struct domain *d, unsigned int id)
>>> +{
>>> +    unsigned int nr = 0;
>>> +
>>> +    /* Don't need the grant lock. This limit is fixed at domain create time. */
>>> +    switch ( id )
>>> +    {
>>> +    case XENMEM_resource_grant_table_id_shared:
>>> +        nr = d->grant_table->max_grant_frames;
>>> +        break;
>>> +
>>> +    case XENMEM_resource_grant_table_id_status:
>>> +        nr = grant_to_status_frames(d->grant_table->max_grant_frames);
>> Two uses of d->grant_table, so perhaps define a stack variable for it?
>
> Can do.
>
>> Also, should you not make sure 0 is returned in the case of a v1 table?
>
> This was the case specifically discussed in the commit message, but
> perhaps it needs expanding.
>
> Doing so would be buggy.
>
> Some utility is going to query the resource size, and then try to map it
> (if it doesn't blindly know the size and/or subset it cares about already).
>
> In between these two hypercalls from the utility, the guest can do a
> v1=>v2 or v2=>v1 switch and make the resource spontaneously appear or
> disappear.
>
> The only case where we can know for certain whether the resource is
> available is when we're in the map hypercall.  Therefore, userspace has
> to be able to get to the map call if there is potentially a resource
> available.
>
> The semantics of the size call are really "this resource might exist,
> and if it does, this is how large it is".

With you deriving from d->grant_table->max_grant_frames, this approach
would imply that by obtaining a mapping the grant tables will get grown to
their permitted maximum, no matter whether as much is actually needed by
the guest. If this is indeed the intention, then we could as well set up
maximum grant structures right at domain creation. Not something I would
favor, but anyway...

Jan
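For contrast, the alternative Jan's observation points towards, reporting how far the tables have actually grown rather than the create-time ceiling, would look roughly like the sketch below. It needs the grant lock and is not repeatable, which is precisely what the posted patch is trying to avoid. The helper names follow the existing grant-table code and are assumed here, not introduced by this patch:

```c
/* NOT the posted behaviour: report the current, not maximum, frame counts. */
unsigned int gnttab_resource_cur_frames(struct domain *d, unsigned int id)
{
    struct grant_table *gt = d->grant_table;
    unsigned int nr = 0;

    grant_read_lock(gt);

    switch ( id )
    {
    case XENMEM_resource_grant_table_id_shared:
        nr = nr_grant_frames(gt);
        break;

    case XENMEM_resource_grant_table_id_status:
        nr = nr_status_frames(gt);  /* 0 while the guest is still on v1. */
        break;
    }

    grant_read_unlock(gt);

    return nr;
}
```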
On 28.07.2020 13:37, Andrew Cooper wrote:
> @@ -1026,19 +1047,6 @@ static int acquire_resource(
>      if ( xmar.pad != 0 )
>          return -EINVAL;
>
> -    if ( guest_handle_is_null(xmar.frame_list) )
> -    {
> -        if ( xmar.nr_frames )
> -            return -EINVAL;
> -
> -        xmar.nr_frames = ARRAY_SIZE(mfn_list);
> -
> -        if ( __copy_field_to_guest(arg, &xmar, nr_frames) )
> -            return -EFAULT;
> -
> -        return 0;
> -    }
> -
>      if ( xmar.nr_frames > ARRAY_SIZE(mfn_list) )
>          return -E2BIG;

While arguably minor, the error code in the null-handle case would imo
better be the same, no matter how big xmar.nr_frames is.

Jan
On 31/07/2020 15:44, Jan Beulich wrote:
> On 28.07.2020 13:37, Andrew Cooper wrote:
>> @@ -1026,19 +1047,6 @@ static int acquire_resource(
>>      if ( xmar.pad != 0 )
>>          return -EINVAL;
>>
>> -    if ( guest_handle_is_null(xmar.frame_list) )
>> -    {
>> -        if ( xmar.nr_frames )
>> -            return -EINVAL;
>> -
>> -        xmar.nr_frames = ARRAY_SIZE(mfn_list);
>> -
>> -        if ( __copy_field_to_guest(arg, &xmar, nr_frames) )
>> -            return -EFAULT;
>> -
>> -        return 0;
>> -    }
>> -
>>      if ( xmar.nr_frames > ARRAY_SIZE(mfn_list) )
>>          return -E2BIG;
> While arguably minor, the error code in the null-handle case
> would imo better be the same, no matter how big xmar.nr_frames
> is.

This clause doesn't survive the fixes to batching.

Given how broken this infrastructure is, I'm not concerned with transient
differences in error codes for users which will ultimately fail anyway.

~Andrew
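To make the point concrete: with the posted ordering, a NULL frame_list combined with nr_frames > 32 yields -E2BIG rather than -EINVAL. A tiny standalone model of a uniform ordering follows; the constant and helper are stand-ins for the real acquire_resource() code, which as noted above is being restructured anyway:

```c
#include <errno.h>
#include <stdbool.h>

#define MFN_LIST_SLOTS 32u  /* stand-in for ARRAY_SIZE(mfn_list) */

static int check_acquire_args(bool frame_list_is_null, unsigned int nr_frames)
{
    if ( frame_list_is_null )
        /* Pure size query: any nonzero nr_frames is simply invalid. */
        return nr_frames ? -EINVAL : 0;

    /* Only real mapping requests are subject to the batch limit. */
    if ( nr_frames > MFN_LIST_SLOTS )
        return -E2BIG;

    return 0;
}
```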
diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c index 82bc676553..f73a90a2ab 100644 --- a/xen/arch/x86/mm.c +++ b/xen/arch/x86/mm.c @@ -4600,6 +4600,26 @@ int xenmem_add_to_physmap_one( return rc; } +unsigned int arch_resource_max_frames( + struct domain *d, unsigned int type, unsigned int id) +{ + unsigned int nr = 0; + + switch ( type ) + { +#ifdef CONFIG_HVM + case XENMEM_resource_ioreq_server: + if ( !is_hvm_domain(d) ) + break; + /* One frame for the buf-ioreq ring, and one frame per 128 vcpus. */ + nr = 1 + DIV_ROUND_UP(d->max_vcpus * sizeof(struct ioreq), PAGE_SIZE); + break; +#endif + } + + return nr; +} + int arch_acquire_resource(struct domain *d, unsigned int type, unsigned int id, unsigned long frame, unsigned int nr_frames, xen_pfn_t mfn_list[]) diff --git a/xen/common/grant_table.c b/xen/common/grant_table.c index 122d1e7596..0962fc7169 100644 --- a/xen/common/grant_table.c +++ b/xen/common/grant_table.c @@ -4013,6 +4013,25 @@ static int gnttab_get_shared_frame_mfn(struct domain *d, return 0; } +unsigned int gnttab_resource_max_frames(struct domain *d, unsigned int id) +{ + unsigned int nr = 0; + + /* Don't need the grant lock. This limit is fixed at domain create time. */ + switch ( id ) + { + case XENMEM_resource_grant_table_id_shared: + nr = d->grant_table->max_grant_frames; + break; + + case XENMEM_resource_grant_table_id_status: + nr = grant_to_status_frames(d->grant_table->max_grant_frames); + break; + } + + return nr; +} + int gnttab_acquire_resource( struct domain *d, unsigned int id, unsigned long frame, unsigned int nr_frames, xen_pfn_t mfn_list[]) diff --git a/xen/common/memory.c b/xen/common/memory.c index dc3a7248e3..21edabf9cc 100644 --- a/xen/common/memory.c +++ b/xen/common/memory.c @@ -1007,6 +1007,26 @@ static long xatp_permission_check(struct domain *d, unsigned int space) return xsm_add_to_physmap(XSM_TARGET, current->domain, d); } +/* + * Return 0 on any kind of error. Caller converts to -EINVAL. + * + * All nonzero values should be repeatable (i.e. derived from some fixed + * proerty of the domain), and describe the full resource (i.e. mapping the + * result of this call will be the entire resource). + */ +static unsigned int resource_max_frames(struct domain *d, + unsigned int type, unsigned int id) +{ + switch ( type ) + { + case XENMEM_resource_grant_table: + return gnttab_resource_max_frames(d, id); + + default: + return arch_resource_max_frames(d, type, id); + } +} + static int acquire_resource( XEN_GUEST_HANDLE_PARAM(xen_mem_acquire_resource_t) arg) { @@ -1018,6 +1038,7 @@ static int acquire_resource( * use-cases then per-CPU arrays or heap allocations may be required. 
*/ xen_pfn_t mfn_list[32]; + unsigned int max_frames; int rc; if ( copy_from_guest(&xmar, arg, 1) ) @@ -1026,19 +1047,6 @@ static int acquire_resource( if ( xmar.pad != 0 ) return -EINVAL; - if ( guest_handle_is_null(xmar.frame_list) ) - { - if ( xmar.nr_frames ) - return -EINVAL; - - xmar.nr_frames = ARRAY_SIZE(mfn_list); - - if ( __copy_field_to_guest(arg, &xmar, nr_frames) ) - return -EFAULT; - - return 0; - } - if ( xmar.nr_frames > ARRAY_SIZE(mfn_list) ) return -E2BIG; @@ -1050,6 +1058,27 @@ static int acquire_resource( if ( rc ) goto out; + max_frames = resource_max_frames(d, xmar.type, xmar.id); + + rc = -EINVAL; + if ( !max_frames ) + goto out; + + if ( guest_handle_is_null(xmar.frame_list) ) + { + if ( xmar.nr_frames ) + goto out; + + xmar.nr_frames = max_frames; + + rc = -EFAULT; + if ( __copy_field_to_guest(arg, &xmar, nr_frames) ) + goto out; + + rc = 0; + goto out; + } + switch ( xmar.type ) { case XENMEM_resource_grant_table: diff --git a/xen/include/asm-x86/mm.h b/xen/include/asm-x86/mm.h index 7e74996053..b0caf372a8 100644 --- a/xen/include/asm-x86/mm.h +++ b/xen/include/asm-x86/mm.h @@ -649,6 +649,9 @@ static inline bool arch_mfn_in_directmap(unsigned long mfn) return mfn <= (virt_to_mfn(eva - 1) + 1); } +unsigned int arch_resource_max_frames(struct domain *d, + unsigned int type, unsigned int id); + int arch_acquire_resource(struct domain *d, unsigned int type, unsigned int id, unsigned long frame, unsigned int nr_frames, xen_pfn_t mfn_list[]); diff --git a/xen/include/public/memory.h b/xen/include/public/memory.h index 21057ed78e..cea88cf40c 100644 --- a/xen/include/public/memory.h +++ b/xen/include/public/memory.h @@ -639,10 +639,18 @@ struct xen_mem_acquire_resource { #define XENMEM_resource_grant_table_id_status 1 /* - * IN/OUT - As an IN parameter number of frames of the resource - * to be mapped. However, if the specified value is 0 and - * frame_list is NULL then this field will be set to the - * maximum value supported by the implementation on return. + * IN/OUT + * + * As an IN parameter number of frames of the resource to be mapped. + * + * When frame_list is NULL and nr_frames is 0, this is interpreted as a + * request for the size of the resource, which shall be returned in the + * nr_frames field. + * + * The size of a resource will never be zero, but a nonzero result doesn't + * guarentee that a subsequent mapping request will be successful. There + * are further type/id specific constraints which may change between the + * two calls. 
*/ uint32_t nr_frames; uint32_t pad; diff --git a/xen/include/xen/grant_table.h b/xen/include/xen/grant_table.h index 5a2c75b880..bae4d79623 100644 --- a/xen/include/xen/grant_table.h +++ b/xen/include/xen/grant_table.h @@ -57,6 +57,8 @@ int mem_sharing_gref_to_gfn(struct grant_table *gt, grant_ref_t ref, int gnttab_map_frame(struct domain *d, unsigned long idx, gfn_t gfn, mfn_t *mfn); +unsigned int gnttab_resource_max_frames(struct domain *d, unsigned int id); + int gnttab_acquire_resource( struct domain *d, unsigned int id, unsigned long frame, unsigned int nr_frames, xen_pfn_t mfn_list[]); @@ -93,6 +95,12 @@ static inline int gnttab_map_frame(struct domain *d, unsigned long idx, return -EINVAL; } +static inline unsigned int gnttab_resource_max_frames( + struct domain *d, unsigned int id) +{ + return 0; +} + static inline int gnttab_acquire_resource( struct domain *d, unsigned int id, unsigned long frame, unsigned int nr_frames, xen_pfn_t mfn_list[]) diff --git a/xen/include/xen/mm.h b/xen/include/xen/mm.h index 1b2c1f6b32..c184dc1db1 100644 --- a/xen/include/xen/mm.h +++ b/xen/include/xen/mm.h @@ -686,6 +686,12 @@ static inline void put_page_alloc_ref(struct page_info *page) } #ifndef CONFIG_ARCH_ACQUIRE_RESOURCE +static inline unsigned int arch_resource_max_frames( + struct domain *d, unsigned int type, unsigned int id) +{ + return 0; +} + static inline int arch_acquire_resource( struct domain *d, unsigned int type, unsigned int id, unsigned long frame, unsigned int nr_frames, xen_pfn_t mfn_list[])
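The size-query convention spelled out in the new memory.h comment looks like this from the guest side. The struct field names come from the public header; the HYPERVISOR_memory_op() and set_xen_guest_handle() plumbing is the usual Linux-style convention and is assumed here rather than taken from this patch:

```c
/* Ask Xen how many frames back the given resource, per the new semantics. */
static int xenmem_resource_size(domid_t domid, unsigned int type,
                                unsigned int id, unsigned int *nr_frames)
{
    struct xen_mem_acquire_resource xmar = {
        .domid = domid,
        .type  = type,
        .id    = id,
        /* .frame = 0, .nr_frames = 0: NULL frame_list makes this a size query. */
    };
    int rc;

    set_xen_guest_handle(xmar.frame_list, NULL);

    rc = HYPERVISOR_memory_op(XENMEM_acquire_resource, &xmar);
    if ( rc )
        return rc;  /* e.g. -EINVAL if the resource doesn't exist for this domain. */

    *nr_frames = xmar.nr_frames;  /* Never zero on success after this patch. */
    return 0;
}
```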
Calling XENMEM_acquire_resource with a NULL frame_list is a request for the
size of the resource, but the returned 32 is bogus.

If someone tries to follow it for XENMEM_resource_ioreq_server, the acquire
call will fail as IOREQ servers currently top out at 2 frames, and it is only
half the size of the default grant table limit for guests.

Also, no users actually request a resource size, because it was never wired up
in the sole implementation of resource acquisition in Linux.

Introduce a new resource_max_frames() to calculate the size of a resource, and
implement it in the IOREQ and grant subsystems.

It is impossible to guarantee that a mapping call following a successful size
call will succeed (e.g. the target IOREQ server gets destroyed, or the domain
switches from grant v2 to v1).  Document the restriction, and use the
flexibility to simplify the paths to be lockless.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: George Dunlap <George.Dunlap@eu.citrix.com>
CC: Ian Jackson <ian.jackson@citrix.com>
CC: Jan Beulich <JBeulich@suse.com>
CC: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
CC: Stefano Stabellini <sstabellini@kernel.org>
CC: Wei Liu <wl@xen.org>
CC: Julien Grall <julien@xen.org>
CC: Paul Durrant <paul@xen.org>
CC: Michał Leszczyński <michal.leszczynski@cert.pl>
CC: Hubert Jasudowicz <hubert.jasudowicz@cert.pl>
---
 xen/arch/x86/mm.c             | 20 ++++++++++++++++
 xen/common/grant_table.c      | 19 +++++++++++++++
 xen/common/memory.c           | 55 +++++++++++++++++++++++++++++++++----------
 xen/include/asm-x86/mm.h      |  3 +++
 xen/include/public/memory.h   | 16 +++++++++----
 xen/include/xen/grant_table.h |  8 +++++++
 xen/include/xen/mm.h          |  6 +++++
 7 files changed, 110 insertions(+), 17 deletions(-)