diff mbox series

[XEN,v14,2/5] x86/pvh: Allow (un)map_pirq when dom0 is PVH

Message ID 20240903070424.982218-3-Jiqian.Chen@amd.com (mailing list archive)
State Superseded
Headers show
Series Support device passthrough when dom0 is PVH on Xen | expand

Commit Message

Chen, Jiqian Sept. 3, 2024, 7:04 a.m. UTC
When dom0 is PVH type and passthrough a device to HVM domU, Qemu code
xen_pt_realize->xc_physdev_map_pirq and libxl code pci_add_dm_done->
xc_physdev_map_pirq map a pirq for passthrough devices.
In xc_physdev_map_pirq call stack, function hvm_physdev_op has a check
has_pirq(currd), but currd is PVH dom0, PVH has no X86_EMU_USE_PIRQ flag,
so it fails, PHYSDEVOP_map_pirq is not allowed for PVH dom0 in current
codes.

But it is fine to map interrupts through pirq to a HVM domain whose
XENFEAT_hvm_pirqs is not enabled. Because pirq field is used as a way to
reference interrupts and it is just the way for the device model to
identify which interrupt should be mapped to which domain, however
has_pirq() is just to check if HVM domains route interrupts from
devices(emulated or passthrough) through event channel, so, the has_pirq()
check should not be applied to the PHYSDEVOP_map_pirq issued by dom0.

So, allow PHYSDEVOP_map_pirq when dom0 is PVH and also allow
PHYSDEVOP_unmap_pirq for the removal device path to unmap pirq. Then the
interrupt of a passthrough device can be successfully mapped to pirq for domU.

Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
Signed-off-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
---
v13->v14 changes:
Modified the commit message.

v12->v13 changes:
Removed the PHYSDEVOP_(un)map_pirq restriction check for pvh domU and added a corresponding description in the commit message.

v11->v12 changes:
Avoid using return, set error code instead when (un)map is not allowed.

v10->v11 changes:
Delete the judgment of "d==currd", so that we can prevent physdev_(un)map_pirq from being executed when domU has no pirq, instead of just preventing self-mapping.
And modify the description of the commit message accordingly.

v9->v10 changes:
Indent the comments above PHYSDEVOP_map_pirq according to the code style.

v8->v9 changes:
Add a comment above PHYSDEVOP_map_pirq to describe why need this hypercall.
Change "!is_pv_domain(d)" to "is_hvm_domain(d)", and "map.domid == DOMID_SELF" to "d == current->domian".

v7->v8 changes:
Add the domid check(domid == DOMID_SELF) to prevent self map when guest doesn't use pirq.
That check was missed in the previous version.

v6->v7 changes:
Nothing.

v5->v6 changes:
Nothing.

v4->v5 changes:
Move the check of self map_pirq to physdev.c, and change to check if the caller has PIRQ flag, and just break for PHYSDEVOP_(un)map_pirq in hvm_physdev_op.

v3->v4 changes:
add check to prevent PVH self map.

v2->v3 changes:
Du to changes in the implementation of the second patch on kernel side(that it will do setup_gsi and map_pirq when assigning a device to passthrough), add PHYSDEVOP_setup_gsi for PVH dom0, and we need to support self mapping.
---
 xen/arch/x86/hvm/hypercall.c | 2 ++
 1 file changed, 2 insertions(+)

Comments

Jan Beulich Sept. 3, 2024, 7:42 a.m. UTC | #1
On 03.09.2024 09:04, Jiqian Chen wrote:
> When dom0 is PVH type and passthrough a device to HVM domU, Qemu code
> xen_pt_realize->xc_physdev_map_pirq and libxl code pci_add_dm_done->
> xc_physdev_map_pirq map a pirq for passthrough devices.
> In xc_physdev_map_pirq call stack, function hvm_physdev_op has a check
> has_pirq(currd), but currd is PVH dom0, PVH has no X86_EMU_USE_PIRQ flag,
> so it fails, PHYSDEVOP_map_pirq is not allowed for PVH dom0 in current
> codes.
> 
> But it is fine to map interrupts through pirq to a HVM domain whose
> XENFEAT_hvm_pirqs is not enabled. Because pirq field is used as a way to
> reference interrupts and it is just the way for the device model to
> identify which interrupt should be mapped to which domain, however
> has_pirq() is just to check if HVM domains route interrupts from
> devices(emulated or passthrough) through event channel, so, the has_pirq()
> check should not be applied to the PHYSDEVOP_map_pirq issued by dom0.
> 
> So, allow PHYSDEVOP_map_pirq when dom0 is PVH and also allow
> PHYSDEVOP_unmap_pirq for the removal device path to unmap pirq. Then the
> interrupt of a passthrough device can be successfully mapped to pirq for domU.

As before: When you talk about just Dom0, ...

> --- a/xen/arch/x86/hvm/hypercall.c
> +++ b/xen/arch/x86/hvm/hypercall.c
> @@ -73,6 +73,8 @@ long hvm_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
>      {
>      case PHYSDEVOP_map_pirq:
>      case PHYSDEVOP_unmap_pirq:
> +        break;
> +
>      case PHYSDEVOP_eoi:
>      case PHYSDEVOP_irq_status_query:
>      case PHYSDEVOP_get_free_pirq:

... that ought to match the code. IOW you've again lost why it is okay(ish)
(or even necessary) to also permit the op for non-Dom0 (e.g. a PVH stubdom).
Similarly imo Dom0 using this on itself wants discussing.

As to my earlier comments regarding your commit message adjustments: I
forgot that the change had to be reverted. I'm sorry for that.

Jan
Chen, Jiqian Sept. 3, 2024, 7:58 a.m. UTC | #2
On 2024/9/3 15:42, Jan Beulich wrote:
> On 03.09.2024 09:04, Jiqian Chen wrote:
>> When dom0 is PVH type and passthrough a device to HVM domU, Qemu code
>> xen_pt_realize->xc_physdev_map_pirq and libxl code pci_add_dm_done->
>> xc_physdev_map_pirq map a pirq for passthrough devices.
>> In xc_physdev_map_pirq call stack, function hvm_physdev_op has a check
>> has_pirq(currd), but currd is PVH dom0, PVH has no X86_EMU_USE_PIRQ flag,
>> so it fails, PHYSDEVOP_map_pirq is not allowed for PVH dom0 in current
>> codes.
>>
>> But it is fine to map interrupts through pirq to a HVM domain whose
>> XENFEAT_hvm_pirqs is not enabled. Because pirq field is used as a way to
>> reference interrupts and it is just the way for the device model to
>> identify which interrupt should be mapped to which domain, however
>> has_pirq() is just to check if HVM domains route interrupts from
>> devices(emulated or passthrough) through event channel, so, the has_pirq()
>> check should not be applied to the PHYSDEVOP_map_pirq issued by dom0.
>>
>> So, allow PHYSDEVOP_map_pirq when dom0 is PVH and also allow
>> PHYSDEVOP_unmap_pirq for the removal device path to unmap pirq. Then the
>> interrupt of a passthrough device can be successfully mapped to pirq for domU.
> 
> As before: When you talk about just Dom0, ...
> 
>> --- a/xen/arch/x86/hvm/hypercall.c
>> +++ b/xen/arch/x86/hvm/hypercall.c
>> @@ -73,6 +73,8 @@ long hvm_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
>>      {
>>      case PHYSDEVOP_map_pirq:
>>      case PHYSDEVOP_unmap_pirq:
>> +        break;
>> +
>>      case PHYSDEVOP_eoi:
>>      case PHYSDEVOP_irq_status_query:
>>      case PHYSDEVOP_get_free_pirq:
> 
> ... that ought to match the code. IOW you've again lost why it is okay(ish)
> (or even necessary) to also permit the op for non-Dom0 (e.g. a PVH stubdom).
> Similarly imo Dom0 using this on itself wants discussing.
Do you mean I need to talk about why permit this op for all HVM and  where can prevent PVH domain calling this for self-mapping, not only dom0?

> 
> As to my earlier comments regarding your commit message adjustments: I
> forgot that the change had to be reverted. I'm sorry for that.
Which change? Do I need to change the codes?

> 
> Jan
Jan Beulich Sept. 3, 2024, 9:25 a.m. UTC | #3
On 03.09.2024 09:58, Chen, Jiqian wrote:
> On 2024/9/3 15:42, Jan Beulich wrote:
>> On 03.09.2024 09:04, Jiqian Chen wrote:
>>> When dom0 is PVH type and passthrough a device to HVM domU, Qemu code
>>> xen_pt_realize->xc_physdev_map_pirq and libxl code pci_add_dm_done->
>>> xc_physdev_map_pirq map a pirq for passthrough devices.
>>> In xc_physdev_map_pirq call stack, function hvm_physdev_op has a check
>>> has_pirq(currd), but currd is PVH dom0, PVH has no X86_EMU_USE_PIRQ flag,
>>> so it fails, PHYSDEVOP_map_pirq is not allowed for PVH dom0 in current
>>> codes.
>>>
>>> But it is fine to map interrupts through pirq to a HVM domain whose
>>> XENFEAT_hvm_pirqs is not enabled. Because pirq field is used as a way to
>>> reference interrupts and it is just the way for the device model to
>>> identify which interrupt should be mapped to which domain, however
>>> has_pirq() is just to check if HVM domains route interrupts from
>>> devices(emulated or passthrough) through event channel, so, the has_pirq()
>>> check should not be applied to the PHYSDEVOP_map_pirq issued by dom0.
>>>
>>> So, allow PHYSDEVOP_map_pirq when dom0 is PVH and also allow
>>> PHYSDEVOP_unmap_pirq for the removal device path to unmap pirq. Then the
>>> interrupt of a passthrough device can be successfully mapped to pirq for domU.
>>
>> As before: When you talk about just Dom0, ...
>>
>>> --- a/xen/arch/x86/hvm/hypercall.c
>>> +++ b/xen/arch/x86/hvm/hypercall.c
>>> @@ -73,6 +73,8 @@ long hvm_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
>>>      {
>>>      case PHYSDEVOP_map_pirq:
>>>      case PHYSDEVOP_unmap_pirq:
>>> +        break;
>>> +
>>>      case PHYSDEVOP_eoi:
>>>      case PHYSDEVOP_irq_status_query:
>>>      case PHYSDEVOP_get_free_pirq:
>>
>> ... that ought to match the code. IOW you've again lost why it is okay(ish)
>> (or even necessary) to also permit the op for non-Dom0 (e.g. a PVH stubdom).
>> Similarly imo Dom0 using this on itself wants discussing.
> Do you mean I need to talk about why permit this op for all HVM

You don't need to invent reasons, but it needs making clear that wider than
necessary (for your purpose) exposure is at least not going to be a problem.

> and  where can prevent PVH domain calling this for self-mapping, not only dom0?

Aiui use on itself is limited to Dom0, so only that would need clarifying
(along the lines of the above, i.e. that/why it is not a problem). For
has_pirq() domains use on oneself was already permitted before.

Jan
Chen, Jiqian Sept. 3, 2024, 10:53 a.m. UTC | #4
On 2024/9/3 17:25, Jan Beulich wrote:
> On 03.09.2024 09:58, Chen, Jiqian wrote:
>> On 2024/9/3 15:42, Jan Beulich wrote:
>>> On 03.09.2024 09:04, Jiqian Chen wrote:
>>>> When dom0 is PVH type and passthrough a device to HVM domU, Qemu code
>>>> xen_pt_realize->xc_physdev_map_pirq and libxl code pci_add_dm_done->
>>>> xc_physdev_map_pirq map a pirq for passthrough devices.
>>>> In xc_physdev_map_pirq call stack, function hvm_physdev_op has a check
>>>> has_pirq(currd), but currd is PVH dom0, PVH has no X86_EMU_USE_PIRQ flag,
>>>> so it fails, PHYSDEVOP_map_pirq is not allowed for PVH dom0 in current
>>>> codes.
>>>>
>>>> But it is fine to map interrupts through pirq to a HVM domain whose
>>>> XENFEAT_hvm_pirqs is not enabled. Because pirq field is used as a way to
>>>> reference interrupts and it is just the way for the device model to
>>>> identify which interrupt should be mapped to which domain, however
>>>> has_pirq() is just to check if HVM domains route interrupts from
>>>> devices(emulated or passthrough) through event channel, so, the has_pirq()
>>>> check should not be applied to the PHYSDEVOP_map_pirq issued by dom0.
>>>>
>>>> So, allow PHYSDEVOP_map_pirq when dom0 is PVH and also allow
>>>> PHYSDEVOP_unmap_pirq for the removal device path to unmap pirq. Then the
>>>> interrupt of a passthrough device can be successfully mapped to pirq for domU.
>>>
>>> As before: When you talk about just Dom0, ...
>>>
>>>> --- a/xen/arch/x86/hvm/hypercall.c
>>>> +++ b/xen/arch/x86/hvm/hypercall.c
>>>> @@ -73,6 +73,8 @@ long hvm_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
>>>>      {
>>>>      case PHYSDEVOP_map_pirq:
>>>>      case PHYSDEVOP_unmap_pirq:
>>>> +        break;
>>>> +
>>>>      case PHYSDEVOP_eoi:
>>>>      case PHYSDEVOP_irq_status_query:
>>>>      case PHYSDEVOP_get_free_pirq:
>>>
>>> ... that ought to match the code. IOW you've again lost why it is okay(ish)
>>> (or even necessary) to also permit the op for non-Dom0 (e.g. a PVH stubdom).
>>> Similarly imo Dom0 using this on itself wants discussing.
>> Do you mean I need to talk about why permit this op for all HVM
> 
> You don't need to invent reasons, but it needs making clear that wider than
> necessary (for your purpose) exposure is at least not going to be a problem.
> 
>> and  where can prevent PVH domain calling this for self-mapping, not only dom0?
> 
> Aiui use on itself is limited to Dom0, so only that would need clarifying
> (along the lines of the above, i.e. that/why it is not a problem). For
> has_pirq() domains use on oneself was already permitted before.

Changed commit message to below. Please check and that will be great helpful if you would show me how to modify it.
{
x86/pvh: Allow (un)map_pirq when dom0 is PVH

Problem: when dom0 is PVH type and passthrough a device to HVM domU, Qemu
code xen_pt_realize->xc_physdev_map_pirq and libxl code pci_add_dm_done->
xc_physdev_map_pirq map a pirq for passthrough devices.
In xc_physdev_map_pirq call stack, function hvm_physdev_op has a check
has_pirq(currd), but currd is PVH dom0, PVH has no X86_EMU_USE_PIRQ flag,
so it fails, PHYSDEVOP_map_pirq is not allowed for PVH dom0 in current
codes.

To solve above problem, need to remove the chack has_pirq() for that
situation(PHYSDEVOP_map_pirq is issued by dom0 for domUs). But without
adding other restrictions, PHYSDEVOP_map_pirq will be allowed wider than
what the problem need.
So, clarify below:

For HVM domUs whose XENFEAT_hvm_pirqs is not enabled,it is fine to map
interrupts through pirq for them. Because pirq field is used as a way to
reference interrupts and it is just the way for the device model to
identify which interrupt should be mapped to which domain, however
has_pirq() is just to check if HVM domains route interrupts from
devices(emulated or passthrough) through event channel, so, remove
has_pirq() check has no impact on HVM domUs.

For PVH domUs that performs such an operation will fail at the check
xsm_map_dedomain_pirq() in physdev_map-nirq().

For PVH dom0, it uses vpci and doesn't use event channel, as above talks,
it also has no impact.
}

> 
> Jan
Jan Beulich Sept. 3, 2024, 2:14 p.m. UTC | #5
On 03.09.2024 12:53, Chen, Jiqian wrote:
> On 2024/9/3 17:25, Jan Beulich wrote:
>> On 03.09.2024 09:58, Chen, Jiqian wrote:
>>> On 2024/9/3 15:42, Jan Beulich wrote:
>>>> On 03.09.2024 09:04, Jiqian Chen wrote:
>>>>> When dom0 is PVH type and passthrough a device to HVM domU, Qemu code
>>>>> xen_pt_realize->xc_physdev_map_pirq and libxl code pci_add_dm_done->
>>>>> xc_physdev_map_pirq map a pirq for passthrough devices.
>>>>> In xc_physdev_map_pirq call stack, function hvm_physdev_op has a check
>>>>> has_pirq(currd), but currd is PVH dom0, PVH has no X86_EMU_USE_PIRQ flag,
>>>>> so it fails, PHYSDEVOP_map_pirq is not allowed for PVH dom0 in current
>>>>> codes.
>>>>>
>>>>> But it is fine to map interrupts through pirq to a HVM domain whose
>>>>> XENFEAT_hvm_pirqs is not enabled. Because pirq field is used as a way to
>>>>> reference interrupts and it is just the way for the device model to
>>>>> identify which interrupt should be mapped to which domain, however
>>>>> has_pirq() is just to check if HVM domains route interrupts from
>>>>> devices(emulated or passthrough) through event channel, so, the has_pirq()
>>>>> check should not be applied to the PHYSDEVOP_map_pirq issued by dom0.
>>>>>
>>>>> So, allow PHYSDEVOP_map_pirq when dom0 is PVH and also allow
>>>>> PHYSDEVOP_unmap_pirq for the removal device path to unmap pirq. Then the
>>>>> interrupt of a passthrough device can be successfully mapped to pirq for domU.
>>>>
>>>> As before: When you talk about just Dom0, ...
>>>>
>>>>> --- a/xen/arch/x86/hvm/hypercall.c
>>>>> +++ b/xen/arch/x86/hvm/hypercall.c
>>>>> @@ -73,6 +73,8 @@ long hvm_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
>>>>>      {
>>>>>      case PHYSDEVOP_map_pirq:
>>>>>      case PHYSDEVOP_unmap_pirq:
>>>>> +        break;
>>>>> +
>>>>>      case PHYSDEVOP_eoi:
>>>>>      case PHYSDEVOP_irq_status_query:
>>>>>      case PHYSDEVOP_get_free_pirq:
>>>>
>>>> ... that ought to match the code. IOW you've again lost why it is okay(ish)
>>>> (or even necessary) to also permit the op for non-Dom0 (e.g. a PVH stubdom).
>>>> Similarly imo Dom0 using this on itself wants discussing.
>>> Do you mean I need to talk about why permit this op for all HVM
>>
>> You don't need to invent reasons, but it needs making clear that wider than
>> necessary (for your purpose) exposure is at least not going to be a problem.
>>
>>> and  where can prevent PVH domain calling this for self-mapping, not only dom0?
>>
>> Aiui use on itself is limited to Dom0, so only that would need clarifying
>> (along the lines of the above, i.e. that/why it is not a problem). For
>> has_pirq() domains use on oneself was already permitted before.
> 
> Changed commit message to below. Please check and that will be great helpful if you would show me how to modify it.
> {
> x86/pvh: Allow (un)map_pirq when dom0 is PVH
> 
> Problem: when dom0 is PVH type and passthrough a device to HVM domU, Qemu
> code xen_pt_realize->xc_physdev_map_pirq and libxl code pci_add_dm_done->
> xc_physdev_map_pirq map a pirq for passthrough devices.
> In xc_physdev_map_pirq call stack, function hvm_physdev_op has a check
> has_pirq(currd), but currd is PVH dom0, PVH has no X86_EMU_USE_PIRQ flag,
> so it fails, PHYSDEVOP_map_pirq is not allowed for PVH dom0 in current
> codes.
> 
> To solve above problem, need to remove the chack has_pirq() for that
> situation(PHYSDEVOP_map_pirq is issued by dom0 for domUs). But without
> adding other restrictions, PHYSDEVOP_map_pirq will be allowed wider than
> what the problem need.
> So, clarify below:
> 
> For HVM domUs whose XENFEAT_hvm_pirqs is not enabled,it is fine to map
> interrupts through pirq for them. Because pirq field is used as a way to
> reference interrupts and it is just the way for the device model to
> identify which interrupt should be mapped to which domain, however
> has_pirq() is just to check if HVM domains route interrupts from
> devices(emulated or passthrough) through event channel, so, remove
> has_pirq() check has no impact on HVM domUs.
> 
> For PVH domUs that performs such an operation will fail at the check
> xsm_map_dedomain_pirq() in physdev_map-nirq().
> 
> For PVH dom0, it uses vpci and doesn't use event channel, as above talks,
> it also has no impact.
> }

This is better than what you had before, and I don't really fancy re-
writing the description effectively from scratch. So let's just go from
there. As to the "excess" permission for the sub-ops: The main thing I'm
after is that it be clarified that we're not going to introduce any
security issues here. That requires auditing the code, and merely saying
"also has no impact" is a little too little for my taste. For Dom0 an
argument may be that it is overly powerful already anyway, even if for
PVH we're a little more strict than for PV (I think).

Jan
Stefano Stabellini Sept. 4, 2024, 1:43 a.m. UTC | #6
On Tue, 3 Sep 2024, Jan Beulich wrote:
> On 03.09.2024 12:53, Chen, Jiqian wrote:
> > On 2024/9/3 17:25, Jan Beulich wrote:
> >> On 03.09.2024 09:58, Chen, Jiqian wrote:
> >>> On 2024/9/3 15:42, Jan Beulich wrote:
> >>>> On 03.09.2024 09:04, Jiqian Chen wrote:
> >>>>> When dom0 is PVH type and passthrough a device to HVM domU, Qemu code
> >>>>> xen_pt_realize->xc_physdev_map_pirq and libxl code pci_add_dm_done->
> >>>>> xc_physdev_map_pirq map a pirq for passthrough devices.
> >>>>> In xc_physdev_map_pirq call stack, function hvm_physdev_op has a check
> >>>>> has_pirq(currd), but currd is PVH dom0, PVH has no X86_EMU_USE_PIRQ flag,
> >>>>> so it fails, PHYSDEVOP_map_pirq is not allowed for PVH dom0 in current
> >>>>> codes.
> >>>>>
> >>>>> But it is fine to map interrupts through pirq to a HVM domain whose
> >>>>> XENFEAT_hvm_pirqs is not enabled. Because pirq field is used as a way to
> >>>>> reference interrupts and it is just the way for the device model to
> >>>>> identify which interrupt should be mapped to which domain, however
> >>>>> has_pirq() is just to check if HVM domains route interrupts from
> >>>>> devices(emulated or passthrough) through event channel, so, the has_pirq()
> >>>>> check should not be applied to the PHYSDEVOP_map_pirq issued by dom0.
> >>>>>
> >>>>> So, allow PHYSDEVOP_map_pirq when dom0 is PVH and also allow
> >>>>> PHYSDEVOP_unmap_pirq for the removal device path to unmap pirq. Then the
> >>>>> interrupt of a passthrough device can be successfully mapped to pirq for domU.
> >>>>
> >>>> As before: When you talk about just Dom0, ...
> >>>>
> >>>>> --- a/xen/arch/x86/hvm/hypercall.c
> >>>>> +++ b/xen/arch/x86/hvm/hypercall.c
> >>>>> @@ -73,6 +73,8 @@ long hvm_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
> >>>>>      {
> >>>>>      case PHYSDEVOP_map_pirq:
> >>>>>      case PHYSDEVOP_unmap_pirq:
> >>>>> +        break;
> >>>>> +
> >>>>>      case PHYSDEVOP_eoi:
> >>>>>      case PHYSDEVOP_irq_status_query:
> >>>>>      case PHYSDEVOP_get_free_pirq:
> >>>>
> >>>> ... that ought to match the code. IOW you've again lost why it is okay(ish)
> >>>> (or even necessary) to also permit the op for non-Dom0 (e.g. a PVH stubdom).
> >>>> Similarly imo Dom0 using this on itself wants discussing.
> >>> Do you mean I need to talk about why permit this op for all HVM
> >>
> >> You don't need to invent reasons, but it needs making clear that wider than
> >> necessary (for your purpose) exposure is at least not going to be a problem.
> >>
> >>> and  where can prevent PVH domain calling this for self-mapping, not only dom0?
> >>
> >> Aiui use on itself is limited to Dom0, so only that would need clarifying
> >> (along the lines of the above, i.e. that/why it is not a problem). For
> >> has_pirq() domains use on oneself was already permitted before.
> > 
> > Changed commit message to below. Please check and that will be great helpful if you would show me how to modify it.
> > {
> > x86/pvh: Allow (un)map_pirq when dom0 is PVH
> > 
> > Problem: when dom0 is PVH type and passthrough a device to HVM domU, Qemu
> > code xen_pt_realize->xc_physdev_map_pirq and libxl code pci_add_dm_done->
> > xc_physdev_map_pirq map a pirq for passthrough devices.
> > In xc_physdev_map_pirq call stack, function hvm_physdev_op has a check
> > has_pirq(currd), but currd is PVH dom0, PVH has no X86_EMU_USE_PIRQ flag,
> > so it fails, PHYSDEVOP_map_pirq is not allowed for PVH dom0 in current
> > codes.
> > 
> > To solve above problem, need to remove the chack has_pirq() for that
> > situation(PHYSDEVOP_map_pirq is issued by dom0 for domUs). But without
> > adding other restrictions, PHYSDEVOP_map_pirq will be allowed wider than
> > what the problem need.
> > So, clarify below:
> > 
> > For HVM domUs whose XENFEAT_hvm_pirqs is not enabled,it is fine to map
> > interrupts through pirq for them. Because pirq field is used as a way to
> > reference interrupts and it is just the way for the device model to
> > identify which interrupt should be mapped to which domain, however
> > has_pirq() is just to check if HVM domains route interrupts from
> > devices(emulated or passthrough) through event channel, so, remove
> > has_pirq() check has no impact on HVM domUs.
> > 
> > For PVH domUs that performs such an operation will fail at the check
> > xsm_map_dedomain_pirq() in physdev_map-nirq().
> > 
> > For PVH dom0, it uses vpci and doesn't use event channel, as above talks,
> > it also has no impact.
> > }
> 
> This is better than what you had before, and I don't really fancy re-
> writing the description effectively from scratch. So let's just go from
> there. As to the "excess" permission for the sub-ops: The main thing I'm
> after is that it be clarified that we're not going to introduce any
> security issues here. That requires auditing the code, and merely saying
> "also has no impact" is a little too little for my taste. For Dom0 an
> argument may be that it is overly powerful already anyway, even if for
> PVH we're a little more strict than for PV (I think).

Hi Jan, for clarity and to make this actionable, are you suggesting to
clarify the commit message by adding wording around "Dom0 is overly
powerful already anyway so it is OK so this is OK" ?

There are typically no restrictions in terms of functionalities offered
to Dom0 as Dom0 can destroy other VMs and reboot the platform.
Jan Beulich Sept. 4, 2024, 6:04 a.m. UTC | #7
On 04.09.2024 03:43, Stefano Stabellini wrote:
> On Tue, 3 Sep 2024, Jan Beulich wrote:
>> On 03.09.2024 12:53, Chen, Jiqian wrote:
>>> On 2024/9/3 17:25, Jan Beulich wrote:
>>>> On 03.09.2024 09:58, Chen, Jiqian wrote:
>>>>> On 2024/9/3 15:42, Jan Beulich wrote:
>>>>>> On 03.09.2024 09:04, Jiqian Chen wrote:
>>>>>>> When dom0 is PVH type and passthrough a device to HVM domU, Qemu code
>>>>>>> xen_pt_realize->xc_physdev_map_pirq and libxl code pci_add_dm_done->
>>>>>>> xc_physdev_map_pirq map a pirq for passthrough devices.
>>>>>>> In xc_physdev_map_pirq call stack, function hvm_physdev_op has a check
>>>>>>> has_pirq(currd), but currd is PVH dom0, PVH has no X86_EMU_USE_PIRQ flag,
>>>>>>> so it fails, PHYSDEVOP_map_pirq is not allowed for PVH dom0 in current
>>>>>>> codes.
>>>>>>>
>>>>>>> But it is fine to map interrupts through pirq to a HVM domain whose
>>>>>>> XENFEAT_hvm_pirqs is not enabled. Because pirq field is used as a way to
>>>>>>> reference interrupts and it is just the way for the device model to
>>>>>>> identify which interrupt should be mapped to which domain, however
>>>>>>> has_pirq() is just to check if HVM domains route interrupts from
>>>>>>> devices(emulated or passthrough) through event channel, so, the has_pirq()
>>>>>>> check should not be applied to the PHYSDEVOP_map_pirq issued by dom0.
>>>>>>>
>>>>>>> So, allow PHYSDEVOP_map_pirq when dom0 is PVH and also allow
>>>>>>> PHYSDEVOP_unmap_pirq for the removal device path to unmap pirq. Then the
>>>>>>> interrupt of a passthrough device can be successfully mapped to pirq for domU.
>>>>>>
>>>>>> As before: When you talk about just Dom0, ...
>>>>>>
>>>>>>> --- a/xen/arch/x86/hvm/hypercall.c
>>>>>>> +++ b/xen/arch/x86/hvm/hypercall.c
>>>>>>> @@ -73,6 +73,8 @@ long hvm_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
>>>>>>>      {
>>>>>>>      case PHYSDEVOP_map_pirq:
>>>>>>>      case PHYSDEVOP_unmap_pirq:
>>>>>>> +        break;
>>>>>>> +
>>>>>>>      case PHYSDEVOP_eoi:
>>>>>>>      case PHYSDEVOP_irq_status_query:
>>>>>>>      case PHYSDEVOP_get_free_pirq:
>>>>>>
>>>>>> ... that ought to match the code. IOW you've again lost why it is okay(ish)
>>>>>> (or even necessary) to also permit the op for non-Dom0 (e.g. a PVH stubdom).
>>>>>> Similarly imo Dom0 using this on itself wants discussing.
>>>>> Do you mean I need to talk about why permit this op for all HVM
>>>>
>>>> You don't need to invent reasons, but it needs making clear that wider than
>>>> necessary (for your purpose) exposure is at least not going to be a problem.
>>>>
>>>>> and  where can prevent PVH domain calling this for self-mapping, not only dom0?
>>>>
>>>> Aiui use on itself is limited to Dom0, so only that would need clarifying
>>>> (along the lines of the above, i.e. that/why it is not a problem). For
>>>> has_pirq() domains use on oneself was already permitted before.
>>>
>>> Changed commit message to below. Please check and that will be great helpful if you would show me how to modify it.
>>> {
>>> x86/pvh: Allow (un)map_pirq when dom0 is PVH
>>>
>>> Problem: when dom0 is PVH type and passthrough a device to HVM domU, Qemu
>>> code xen_pt_realize->xc_physdev_map_pirq and libxl code pci_add_dm_done->
>>> xc_physdev_map_pirq map a pirq for passthrough devices.
>>> In xc_physdev_map_pirq call stack, function hvm_physdev_op has a check
>>> has_pirq(currd), but currd is PVH dom0, PVH has no X86_EMU_USE_PIRQ flag,
>>> so it fails, PHYSDEVOP_map_pirq is not allowed for PVH dom0 in current
>>> codes.
>>>
>>> To solve above problem, need to remove the chack has_pirq() for that
>>> situation(PHYSDEVOP_map_pirq is issued by dom0 for domUs). But without
>>> adding other restrictions, PHYSDEVOP_map_pirq will be allowed wider than
>>> what the problem need.
>>> So, clarify below:
>>>
>>> For HVM domUs whose XENFEAT_hvm_pirqs is not enabled,it is fine to map
>>> interrupts through pirq for them. Because pirq field is used as a way to
>>> reference interrupts and it is just the way for the device model to
>>> identify which interrupt should be mapped to which domain, however
>>> has_pirq() is just to check if HVM domains route interrupts from
>>> devices(emulated or passthrough) through event channel, so, remove
>>> has_pirq() check has no impact on HVM domUs.
>>>
>>> For PVH domUs that performs such an operation will fail at the check
>>> xsm_map_dedomain_pirq() in physdev_map-nirq().
>>>
>>> For PVH dom0, it uses vpci and doesn't use event channel, as above talks,
>>> it also has no impact.
>>> }
>>
>> This is better than what you had before, and I don't really fancy re-
>> writing the description effectively from scratch. So let's just go from
>> there. As to the "excess" permission for the sub-ops: The main thing I'm
>> after is that it be clarified that we're not going to introduce any
>> security issues here. That requires auditing the code, and merely saying
>> "also has no impact" is a little too little for my taste. For Dom0 an
>> argument may be that it is overly powerful already anyway, even if for
>> PVH we're a little more strict than for PV (I think).
> 
> Hi Jan, for clarity and to make this actionable, are you suggesting to
> clarify the commit message by adding wording around "Dom0 is overly
> powerful already anyway so it is OK so this is OK" ?

Yes, perhaps with "deemed" added. And text for DomU-s similarly extended,
as the pointing at the XSM check is presently incomplete (stubdom-s can
pass that check, after all, as can de-priv qemu running in Dom0).

Jan
Chen, Jiqian Sept. 5, 2024, 6:45 a.m. UTC | #8
HI,

On 2024/9/4 14:04, Jan Beulich wrote:
> On 04.09.2024 03:43, Stefano Stabellini wrote:
>> On Tue, 3 Sep 2024, Jan Beulich wrote:
>>> On 03.09.2024 12:53, Chen, Jiqian wrote:
>>>> On 2024/9/3 17:25, Jan Beulich wrote:
>>>>> On 03.09.2024 09:58, Chen, Jiqian wrote:
>>>>>> On 2024/9/3 15:42, Jan Beulich wrote:
>>>>>>> On 03.09.2024 09:04, Jiqian Chen wrote:
>>>>>>>> When dom0 is PVH type and passthrough a device to HVM domU, Qemu code
>>>>>>>> xen_pt_realize->xc_physdev_map_pirq and libxl code pci_add_dm_done->
>>>>>>>> xc_physdev_map_pirq map a pirq for passthrough devices.
>>>>>>>> In xc_physdev_map_pirq call stack, function hvm_physdev_op has a check
>>>>>>>> has_pirq(currd), but currd is PVH dom0, PVH has no X86_EMU_USE_PIRQ flag,
>>>>>>>> so it fails, PHYSDEVOP_map_pirq is not allowed for PVH dom0 in current
>>>>>>>> codes.
>>>>>>>>
>>>>>>>> But it is fine to map interrupts through pirq to a HVM domain whose
>>>>>>>> XENFEAT_hvm_pirqs is not enabled. Because pirq field is used as a way to
>>>>>>>> reference interrupts and it is just the way for the device model to
>>>>>>>> identify which interrupt should be mapped to which domain, however
>>>>>>>> has_pirq() is just to check if HVM domains route interrupts from
>>>>>>>> devices(emulated or passthrough) through event channel, so, the has_pirq()
>>>>>>>> check should not be applied to the PHYSDEVOP_map_pirq issued by dom0.
>>>>>>>>
>>>>>>>> So, allow PHYSDEVOP_map_pirq when dom0 is PVH and also allow
>>>>>>>> PHYSDEVOP_unmap_pirq for the removal device path to unmap pirq. Then the
>>>>>>>> interrupt of a passthrough device can be successfully mapped to pirq for domU.
>>>>>>>
>>>>>>> As before: When you talk about just Dom0, ...
>>>>>>>
>>>>>>>> --- a/xen/arch/x86/hvm/hypercall.c
>>>>>>>> +++ b/xen/arch/x86/hvm/hypercall.c
>>>>>>>> @@ -73,6 +73,8 @@ long hvm_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
>>>>>>>>      {
>>>>>>>>      case PHYSDEVOP_map_pirq:
>>>>>>>>      case PHYSDEVOP_unmap_pirq:
>>>>>>>> +        break;
>>>>>>>> +
>>>>>>>>      case PHYSDEVOP_eoi:
>>>>>>>>      case PHYSDEVOP_irq_status_query:
>>>>>>>>      case PHYSDEVOP_get_free_pirq:
>>>>>>>
>>>>>>> ... that ought to match the code. IOW you've again lost why it is okay(ish)
>>>>>>> (or even necessary) to also permit the op for non-Dom0 (e.g. a PVH stubdom).
>>>>>>> Similarly imo Dom0 using this on itself wants discussing.
>>>>>> Do you mean I need to talk about why permit this op for all HVM
>>>>>
>>>>> You don't need to invent reasons, but it needs making clear that wider than
>>>>> necessary (for your purpose) exposure is at least not going to be a problem.
>>>>>
>>>>>> and  where can prevent PVH domain calling this for self-mapping, not only dom0?
>>>>>
>>>>> Aiui use on itself is limited to Dom0, so only that would need clarifying
>>>>> (along the lines of the above, i.e. that/why it is not a problem). For
>>>>> has_pirq() domains use on oneself was already permitted before.
>>>>
>>>> Changed commit message to below. Please check and that will be great helpful if you would show me how to modify it.
>>>> {
>>>> x86/pvh: Allow (un)map_pirq when dom0 is PVH
>>>>
>>>> Problem: when dom0 is PVH type and passthrough a device to HVM domU, Qemu
>>>> code xen_pt_realize->xc_physdev_map_pirq and libxl code pci_add_dm_done->
>>>> xc_physdev_map_pirq map a pirq for passthrough devices.
>>>> In xc_physdev_map_pirq call stack, function hvm_physdev_op has a check
>>>> has_pirq(currd), but currd is PVH dom0, PVH has no X86_EMU_USE_PIRQ flag,
>>>> so it fails, PHYSDEVOP_map_pirq is not allowed for PVH dom0 in current
>>>> codes.
>>>>
>>>> To solve above problem, need to remove the chack has_pirq() for that
>>>> situation(PHYSDEVOP_map_pirq is issued by dom0 for domUs). But without
>>>> adding other restrictions, PHYSDEVOP_map_pirq will be allowed wider than
>>>> what the problem need.
>>>> So, clarify below:
>>>>
>>>> For HVM domUs whose XENFEAT_hvm_pirqs is not enabled,it is fine to map
>>>> interrupts through pirq for them. Because pirq field is used as a way to
>>>> reference interrupts and it is just the way for the device model to
>>>> identify which interrupt should be mapped to which domain, however
>>>> has_pirq() is just to check if HVM domains route interrupts from
>>>> devices(emulated or passthrough) through event channel, so, remove
>>>> has_pirq() check has no impact on HVM domUs.
>>>>
>>>> For PVH domUs that performs such an operation will fail at the check
>>>> xsm_map_dedomain_pirq() in physdev_map-nirq().
>>>>
>>>> For PVH dom0, it uses vpci and doesn't use event channel, as above talks,
>>>> it also has no impact.
>>>> }
>>>
>>> This is better than what you had before, and I don't really fancy re-
>>> writing the description effectively from scratch. So let's just go from
>>> there. As to the "excess" permission for the sub-ops: The main thing I'm
>>> after is that it be clarified that we're not going to introduce any
>>> security issues here. That requires auditing the code, and merely saying
>>> "also has no impact" is a little too little for my taste. For Dom0 an
>>> argument may be that it is overly powerful already anyway, even if for
>>> PVH we're a little more strict than for PV (I think).
>>
>> Hi Jan, for clarity and to make this actionable, are you suggesting to
>> clarify the commit message by adding wording around "Dom0 is overly
>> powerful already anyway so it is OK so this is OK" ?
> 
> Yes, perhaps with "deemed" added. 
OK, for PVH dom0, I will change to " Dom0 is deemed overly powerful already anyway, so it is OK "

> And text for DomU-s similarly extended, as the pointing at the XSM check is presently incomplete (stubdom-s can
> pass that check, after all, as can de-priv qemu running in Dom0).
So sorry, I know so little about this, I can't explain these situations, could you tell me how to describe or help me write a paragraph?

> 
> Jan
Jan Beulich Sept. 5, 2024, 9:44 a.m. UTC | #9
On 05.09.2024 08:45, Chen, Jiqian wrote:
> HI,
> 
> On 2024/9/4 14:04, Jan Beulich wrote:
>> On 04.09.2024 03:43, Stefano Stabellini wrote:
>>> On Tue, 3 Sep 2024, Jan Beulich wrote:
>>>> On 03.09.2024 12:53, Chen, Jiqian wrote:
>>>>> On 2024/9/3 17:25, Jan Beulich wrote:
>>>>>> On 03.09.2024 09:58, Chen, Jiqian wrote:
>>>>>>> On 2024/9/3 15:42, Jan Beulich wrote:
>>>>>>>> On 03.09.2024 09:04, Jiqian Chen wrote:
>>>>>>>>> When dom0 is PVH type and passthrough a device to HVM domU, Qemu code
>>>>>>>>> xen_pt_realize->xc_physdev_map_pirq and libxl code pci_add_dm_done->
>>>>>>>>> xc_physdev_map_pirq map a pirq for passthrough devices.
>>>>>>>>> In xc_physdev_map_pirq call stack, function hvm_physdev_op has a check
>>>>>>>>> has_pirq(currd), but currd is PVH dom0, PVH has no X86_EMU_USE_PIRQ flag,
>>>>>>>>> so it fails, PHYSDEVOP_map_pirq is not allowed for PVH dom0 in current
>>>>>>>>> codes.
>>>>>>>>>
>>>>>>>>> But it is fine to map interrupts through pirq to a HVM domain whose
>>>>>>>>> XENFEAT_hvm_pirqs is not enabled. Because pirq field is used as a way to
>>>>>>>>> reference interrupts and it is just the way for the device model to
>>>>>>>>> identify which interrupt should be mapped to which domain, however
>>>>>>>>> has_pirq() is just to check if HVM domains route interrupts from
>>>>>>>>> devices(emulated or passthrough) through event channel, so, the has_pirq()
>>>>>>>>> check should not be applied to the PHYSDEVOP_map_pirq issued by dom0.
>>>>>>>>>
>>>>>>>>> So, allow PHYSDEVOP_map_pirq when dom0 is PVH and also allow
>>>>>>>>> PHYSDEVOP_unmap_pirq for the removal device path to unmap pirq. Then the
>>>>>>>>> interrupt of a passthrough device can be successfully mapped to pirq for domU.
>>>>>>>>
>>>>>>>> As before: When you talk about just Dom0, ...
>>>>>>>>
>>>>>>>>> --- a/xen/arch/x86/hvm/hypercall.c
>>>>>>>>> +++ b/xen/arch/x86/hvm/hypercall.c
>>>>>>>>> @@ -73,6 +73,8 @@ long hvm_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
>>>>>>>>>      {
>>>>>>>>>      case PHYSDEVOP_map_pirq:
>>>>>>>>>      case PHYSDEVOP_unmap_pirq:
>>>>>>>>> +        break;
>>>>>>>>> +
>>>>>>>>>      case PHYSDEVOP_eoi:
>>>>>>>>>      case PHYSDEVOP_irq_status_query:
>>>>>>>>>      case PHYSDEVOP_get_free_pirq:
>>>>>>>>
>>>>>>>> ... that ought to match the code. IOW you've again lost why it is okay(ish)
>>>>>>>> (or even necessary) to also permit the op for non-Dom0 (e.g. a PVH stubdom).
>>>>>>>> Similarly imo Dom0 using this on itself wants discussing.
>>>>>>> Do you mean I need to talk about why permit this op for all HVM
>>>>>>
>>>>>> You don't need to invent reasons, but it needs making clear that wider than
>>>>>> necessary (for your purpose) exposure is at least not going to be a problem.
>>>>>>
>>>>>>> and  where can prevent PVH domain calling this for self-mapping, not only dom0?
>>>>>>
>>>>>> Aiui use on itself is limited to Dom0, so only that would need clarifying
>>>>>> (along the lines of the above, i.e. that/why it is not a problem). For
>>>>>> has_pirq() domains use on oneself was already permitted before.
>>>>>
>>>>> Changed commit message to below. Please check and that will be great helpful if you would show me how to modify it.
>>>>> {
>>>>> x86/pvh: Allow (un)map_pirq when dom0 is PVH
>>>>>
>>>>> Problem: when dom0 is PVH type and passthrough a device to HVM domU, Qemu
>>>>> code xen_pt_realize->xc_physdev_map_pirq and libxl code pci_add_dm_done->
>>>>> xc_physdev_map_pirq map a pirq for passthrough devices.
>>>>> In xc_physdev_map_pirq call stack, function hvm_physdev_op has a check
>>>>> has_pirq(currd), but currd is PVH dom0, PVH has no X86_EMU_USE_PIRQ flag,
>>>>> so it fails, PHYSDEVOP_map_pirq is not allowed for PVH dom0 in current
>>>>> codes.
>>>>>
>>>>> To solve above problem, need to remove the chack has_pirq() for that
>>>>> situation(PHYSDEVOP_map_pirq is issued by dom0 for domUs). But without
>>>>> adding other restrictions, PHYSDEVOP_map_pirq will be allowed wider than
>>>>> what the problem need.
>>>>> So, clarify below:
>>>>>
>>>>> For HVM domUs whose XENFEAT_hvm_pirqs is not enabled,it is fine to map
>>>>> interrupts through pirq for them. Because pirq field is used as a way to
>>>>> reference interrupts and it is just the way for the device model to
>>>>> identify which interrupt should be mapped to which domain, however
>>>>> has_pirq() is just to check if HVM domains route interrupts from
>>>>> devices(emulated or passthrough) through event channel, so, remove
>>>>> has_pirq() check has no impact on HVM domUs.
>>>>>
>>>>> For PVH domUs that performs such an operation will fail at the check
>>>>> xsm_map_dedomain_pirq() in physdev_map-nirq().
>>>>>
>>>>> For PVH dom0, it uses vpci and doesn't use event channel, as above talks,
>>>>> it also has no impact.
>>>>> }
>>>>
>>>> This is better than what you had before, and I don't really fancy re-
>>>> writing the description effectively from scratch. So let's just go from
>>>> there. As to the "excess" permission for the sub-ops: The main thing I'm
>>>> after is that it be clarified that we're not going to introduce any
>>>> security issues here. That requires auditing the code, and merely saying
>>>> "also has no impact" is a little too little for my taste. For Dom0 an
>>>> argument may be that it is overly powerful already anyway, even if for
>>>> PVH we're a little more strict than for PV (I think).
>>>
>>> Hi Jan, for clarity and to make this actionable, are you suggesting to
>>> clarify the commit message by adding wording around "Dom0 is overly
>>> powerful already anyway so it is OK so this is OK" ?
>>
>> Yes, perhaps with "deemed" added. 
> OK, for PVH dom0, I will change to " Dom0 is deemed overly powerful already anyway, so it is OK "

I don't mind the deemed as you add it, but the important place to add it
here is before "OK". I'm sorry, it didn't occur to me that after all the
prior discussion this earlier reply of mine could still be mis-interpreted.

>> And text for DomU-s similarly extended, as the pointing at the XSM check is presently incomplete (stubdom-s can
>> pass that check, after all, as can de-priv qemu running in Dom0).
> So sorry, I know so little about this, I can't explain these situations, could you tell me how to describe or help me write a paragraph?

I'm afraid that in order to make (propose) such a change you need to be
able to explain why it is okay to expose functionality beyond where it's
presently exposed. It's not just writing a new paragraph that's needed
here. You first need to _check_ that what you do is okay. And once you've
done that checking, you then summarize that in writing.

Jan
Stefano Stabellini Sept. 5, 2024, 10:51 p.m. UTC | #10
On Thu, 5 Sep 2024, Jan Beulich wrote:
> On 05.09.2024 08:45, Chen, Jiqian wrote:
> > HI,
> > 
> > On 2024/9/4 14:04, Jan Beulich wrote:
> >> On 04.09.2024 03:43, Stefano Stabellini wrote:
> >>> On Tue, 3 Sep 2024, Jan Beulich wrote:
> >>>> On 03.09.2024 12:53, Chen, Jiqian wrote:
> >>>>> On 2024/9/3 17:25, Jan Beulich wrote:
> >>>>>> On 03.09.2024 09:58, Chen, Jiqian wrote:
> >>>>>>> On 2024/9/3 15:42, Jan Beulich wrote:
> >>>>>>>> On 03.09.2024 09:04, Jiqian Chen wrote:
> >>>>>>>>> When dom0 is PVH type and passthrough a device to HVM domU, Qemu code
> >>>>>>>>> xen_pt_realize->xc_physdev_map_pirq and libxl code pci_add_dm_done->
> >>>>>>>>> xc_physdev_map_pirq map a pirq for passthrough devices.
> >>>>>>>>> In xc_physdev_map_pirq call stack, function hvm_physdev_op has a check
> >>>>>>>>> has_pirq(currd), but currd is PVH dom0, PVH has no X86_EMU_USE_PIRQ flag,
> >>>>>>>>> so it fails, PHYSDEVOP_map_pirq is not allowed for PVH dom0 in current
> >>>>>>>>> codes.
> >>>>>>>>>
> >>>>>>>>> But it is fine to map interrupts through pirq to a HVM domain whose
> >>>>>>>>> XENFEAT_hvm_pirqs is not enabled. Because pirq field is used as a way to
> >>>>>>>>> reference interrupts and it is just the way for the device model to
> >>>>>>>>> identify which interrupt should be mapped to which domain, however
> >>>>>>>>> has_pirq() is just to check if HVM domains route interrupts from
> >>>>>>>>> devices(emulated or passthrough) through event channel, so, the has_pirq()
> >>>>>>>>> check should not be applied to the PHYSDEVOP_map_pirq issued by dom0.
> >>>>>>>>>
> >>>>>>>>> So, allow PHYSDEVOP_map_pirq when dom0 is PVH and also allow
> >>>>>>>>> PHYSDEVOP_unmap_pirq for the removal device path to unmap pirq. Then the
> >>>>>>>>> interrupt of a passthrough device can be successfully mapped to pirq for domU.
> >>>>>>>>
> >>>>>>>> As before: When you talk about just Dom0, ...
> >>>>>>>>
> >>>>>>>>> --- a/xen/arch/x86/hvm/hypercall.c
> >>>>>>>>> +++ b/xen/arch/x86/hvm/hypercall.c
> >>>>>>>>> @@ -73,6 +73,8 @@ long hvm_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
> >>>>>>>>>      {
> >>>>>>>>>      case PHYSDEVOP_map_pirq:
> >>>>>>>>>      case PHYSDEVOP_unmap_pirq:
> >>>>>>>>> +        break;
> >>>>>>>>> +
> >>>>>>>>>      case PHYSDEVOP_eoi:
> >>>>>>>>>      case PHYSDEVOP_irq_status_query:
> >>>>>>>>>      case PHYSDEVOP_get_free_pirq:
> >>>>>>>>
> >>>>>>>> ... that ought to match the code. IOW you've again lost why it is okay(ish)
> >>>>>>>> (or even necessary) to also permit the op for non-Dom0 (e.g. a PVH stubdom).
> >>>>>>>> Similarly imo Dom0 using this on itself wants discussing.
> >>>>>>> Do you mean I need to talk about why permit this op for all HVM
> >>>>>>
> >>>>>> You don't need to invent reasons, but it needs making clear that wider than
> >>>>>> necessary (for your purpose) exposure is at least not going to be a problem.
> >>>>>>
> >>>>>>> and  where can prevent PVH domain calling this for self-mapping, not only dom0?
> >>>>>>
> >>>>>> Aiui use on itself is limited to Dom0, so only that would need clarifying
> >>>>>> (along the lines of the above, i.e. that/why it is not a problem). For
> >>>>>> has_pirq() domains use on oneself was already permitted before.
> >>>>>
> >>>>> Changed commit message to below. Please check and that will be great helpful if you would show me how to modify it.
> >>>>> {
> >>>>> x86/pvh: Allow (un)map_pirq when dom0 is PVH
> >>>>>
> >>>>> Problem: when dom0 is PVH type and passthrough a device to HVM domU, Qemu
> >>>>> code xen_pt_realize->xc_physdev_map_pirq and libxl code pci_add_dm_done->
> >>>>> xc_physdev_map_pirq map a pirq for passthrough devices.
> >>>>> In xc_physdev_map_pirq call stack, function hvm_physdev_op has a check
> >>>>> has_pirq(currd), but currd is PVH dom0, PVH has no X86_EMU_USE_PIRQ flag,
> >>>>> so it fails, PHYSDEVOP_map_pirq is not allowed for PVH dom0 in current
> >>>>> codes.
> >>>>>
> >>>>> To solve above problem, need to remove the chack has_pirq() for that
> >>>>> situation(PHYSDEVOP_map_pirq is issued by dom0 for domUs). But without
> >>>>> adding other restrictions, PHYSDEVOP_map_pirq will be allowed wider than
> >>>>> what the problem need.
> >>>>> So, clarify below:
> >>>>>
> >>>>> For HVM domUs whose XENFEAT_hvm_pirqs is not enabled,it is fine to map
> >>>>> interrupts through pirq for them. Because pirq field is used as a way to
> >>>>> reference interrupts and it is just the way for the device model to
> >>>>> identify which interrupt should be mapped to which domain, however
> >>>>> has_pirq() is just to check if HVM domains route interrupts from
> >>>>> devices(emulated or passthrough) through event channel, so, remove
> >>>>> has_pirq() check has no impact on HVM domUs.
> >>>>>
> >>>>> For PVH domUs that performs such an operation will fail at the check
> >>>>> xsm_map_dedomain_pirq() in physdev_map-nirq().
> >>>>>
> >>>>> For PVH dom0, it uses vpci and doesn't use event channel, as above talks,
> >>>>> it also has no impact.
> >>>>> }
> >>>>
> >>>> This is better than what you had before, and I don't really fancy re-
> >>>> writing the description effectively from scratch. So let's just go from
> >>>> there. As to the "excess" permission for the sub-ops: The main thing I'm
> >>>> after is that it be clarified that we're not going to introduce any
> >>>> security issues here. That requires auditing the code, and merely saying
> >>>> "also has no impact" is a little too little for my taste. For Dom0 an
> >>>> argument may be that it is overly powerful already anyway, even if for
> >>>> PVH we're a little more strict than for PV (I think).
> >>>
> >>> Hi Jan, for clarity and to make this actionable, are you suggesting to
> >>> clarify the commit message by adding wording around "Dom0 is overly
> >>> powerful already anyway so it is OK so this is OK" ?
> >>
> >> Yes, perhaps with "deemed" added. 
> > OK, for PVH dom0, I will change to " Dom0 is deemed overly powerful already anyway, so it is OK "
> 
> I don't mind the deemed as you add it, but the important place to add it
> here is before "OK". I'm sorry, it didn't occur to me that after all the
> prior discussion this earlier reply of mine could still be mis-interpreted.
> 
> >> And text for DomU-s similarly extended, as the pointing at the XSM check is presently incomplete (stubdom-s can
> >> pass that check, after all, as can de-priv qemu running in Dom0).
> > So sorry, I know so little about this, I can't explain these situations, could you tell me how to describe or help me write a paragraph?
> 
> I'm afraid that in order to make (propose) such a change you need to be
> able to explain why it is okay to expose functionality beyond where it's
> presently exposed. It's not just writing a new paragraph that's needed
> here. You first need to _check_ that what you do is okay. And once you've
> done that checking, you then summarize that in writing.
 

PHYSDEVOP_map_pirq ends up calling physdev_map_pirq which is protected
by:

    ret = xsm_map_domain_pirq(XSM_DM_PRIV, d);
    if ( ret )
        return ret;

Dom0 having permission to do PHYSDEVOP_map_pirq even without has_pirq is
fine. Device models are also OK because the code we are trying to enable
is in fact part of the device model. If someone were to run an HVM
stubdom they might need this patch as well.

If PHYSDEVOP_map_pirq is allowed, also PHYSDEVOP_unmap_pirq should be
allowed.

Is this explanation OK?
Jan Beulich Sept. 6, 2024, 6:26 a.m. UTC | #11
On 06.09.2024 00:51, Stefano Stabellini wrote:
> On Thu, 5 Sep 2024, Jan Beulich wrote:
>> On 05.09.2024 08:45, Chen, Jiqian wrote:
>>> HI,
>>>
>>> On 2024/9/4 14:04, Jan Beulich wrote:
>>>> On 04.09.2024 03:43, Stefano Stabellini wrote:
>>>>> On Tue, 3 Sep 2024, Jan Beulich wrote:
>>>>>> On 03.09.2024 12:53, Chen, Jiqian wrote:
>>>>>>> On 2024/9/3 17:25, Jan Beulich wrote:
>>>>>>>> On 03.09.2024 09:58, Chen, Jiqian wrote:
>>>>>>>>> On 2024/9/3 15:42, Jan Beulich wrote:
>>>>>>>>>> On 03.09.2024 09:04, Jiqian Chen wrote:
>>>>>>>>>>> When dom0 is PVH type and passthrough a device to HVM domU, Qemu code
>>>>>>>>>>> xen_pt_realize->xc_physdev_map_pirq and libxl code pci_add_dm_done->
>>>>>>>>>>> xc_physdev_map_pirq map a pirq for passthrough devices.
>>>>>>>>>>> In xc_physdev_map_pirq call stack, function hvm_physdev_op has a check
>>>>>>>>>>> has_pirq(currd), but currd is PVH dom0, PVH has no X86_EMU_USE_PIRQ flag,
>>>>>>>>>>> so it fails, PHYSDEVOP_map_pirq is not allowed for PVH dom0 in current
>>>>>>>>>>> codes.
>>>>>>>>>>>
>>>>>>>>>>> But it is fine to map interrupts through pirq to a HVM domain whose
>>>>>>>>>>> XENFEAT_hvm_pirqs is not enabled. Because pirq field is used as a way to
>>>>>>>>>>> reference interrupts and it is just the way for the device model to
>>>>>>>>>>> identify which interrupt should be mapped to which domain, however
>>>>>>>>>>> has_pirq() is just to check if HVM domains route interrupts from
>>>>>>>>>>> devices(emulated or passthrough) through event channel, so, the has_pirq()
>>>>>>>>>>> check should not be applied to the PHYSDEVOP_map_pirq issued by dom0.
>>>>>>>>>>>
>>>>>>>>>>> So, allow PHYSDEVOP_map_pirq when dom0 is PVH and also allow
>>>>>>>>>>> PHYSDEVOP_unmap_pirq for the removal device path to unmap pirq. Then the
>>>>>>>>>>> interrupt of a passthrough device can be successfully mapped to pirq for domU.
>>>>>>>>>>
>>>>>>>>>> As before: When you talk about just Dom0, ...
>>>>>>>>>>
>>>>>>>>>>> --- a/xen/arch/x86/hvm/hypercall.c
>>>>>>>>>>> +++ b/xen/arch/x86/hvm/hypercall.c
>>>>>>>>>>> @@ -73,6 +73,8 @@ long hvm_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
>>>>>>>>>>>      {
>>>>>>>>>>>      case PHYSDEVOP_map_pirq:
>>>>>>>>>>>      case PHYSDEVOP_unmap_pirq:
>>>>>>>>>>> +        break;
>>>>>>>>>>> +
>>>>>>>>>>>      case PHYSDEVOP_eoi:
>>>>>>>>>>>      case PHYSDEVOP_irq_status_query:
>>>>>>>>>>>      case PHYSDEVOP_get_free_pirq:
>>>>>>>>>>
>>>>>>>>>> ... that ought to match the code. IOW you've again lost why it is okay(ish)
>>>>>>>>>> (or even necessary) to also permit the op for non-Dom0 (e.g. a PVH stubdom).
>>>>>>>>>> Similarly imo Dom0 using this on itself wants discussing.
>>>>>>>>> Do you mean I need to talk about why permit this op for all HVM
>>>>>>>>
>>>>>>>> You don't need to invent reasons, but it needs making clear that wider than
>>>>>>>> necessary (for your purpose) exposure is at least not going to be a problem.
>>>>>>>>
>>>>>>>>> and  where can prevent PVH domain calling this for self-mapping, not only dom0?
>>>>>>>>
>>>>>>>> Aiui use on itself is limited to Dom0, so only that would need clarifying
>>>>>>>> (along the lines of the above, i.e. that/why it is not a problem). For
>>>>>>>> has_pirq() domains use on oneself was already permitted before.
>>>>>>>
>>>>>>> Changed commit message to below. Please check and that will be great helpful if you would show me how to modify it.
>>>>>>> {
>>>>>>> x86/pvh: Allow (un)map_pirq when dom0 is PVH
>>>>>>>
>>>>>>> Problem: when dom0 is PVH type and passthrough a device to HVM domU, Qemu
>>>>>>> code xen_pt_realize->xc_physdev_map_pirq and libxl code pci_add_dm_done->
>>>>>>> xc_physdev_map_pirq map a pirq for passthrough devices.
>>>>>>> In xc_physdev_map_pirq call stack, function hvm_physdev_op has a check
>>>>>>> has_pirq(currd), but currd is PVH dom0, PVH has no X86_EMU_USE_PIRQ flag,
>>>>>>> so it fails, PHYSDEVOP_map_pirq is not allowed for PVH dom0 in current
>>>>>>> codes.
>>>>>>>
>>>>>>> To solve above problem, need to remove the chack has_pirq() for that
>>>>>>> situation(PHYSDEVOP_map_pirq is issued by dom0 for domUs). But without
>>>>>>> adding other restrictions, PHYSDEVOP_map_pirq will be allowed wider than
>>>>>>> what the problem need.
>>>>>>> So, clarify below:
>>>>>>>
>>>>>>> For HVM domUs whose XENFEAT_hvm_pirqs is not enabled,it is fine to map
>>>>>>> interrupts through pirq for them. Because pirq field is used as a way to
>>>>>>> reference interrupts and it is just the way for the device model to
>>>>>>> identify which interrupt should be mapped to which domain, however
>>>>>>> has_pirq() is just to check if HVM domains route interrupts from
>>>>>>> devices(emulated or passthrough) through event channel, so, remove
>>>>>>> has_pirq() check has no impact on HVM domUs.
>>>>>>>
>>>>>>> For PVH domUs that performs such an operation will fail at the check
>>>>>>> xsm_map_dedomain_pirq() in physdev_map-nirq().
>>>>>>>
>>>>>>> For PVH dom0, it uses vpci and doesn't use event channel, as above talks,
>>>>>>> it also has no impact.
>>>>>>> }
>>>>>>
>>>>>> This is better than what you had before, and I don't really fancy re-
>>>>>> writing the description effectively from scratch. So let's just go from
>>>>>> there. As to the "excess" permission for the sub-ops: The main thing I'm
>>>>>> after is that it be clarified that we're not going to introduce any
>>>>>> security issues here. That requires auditing the code, and merely saying
>>>>>> "also has no impact" is a little too little for my taste. For Dom0 an
>>>>>> argument may be that it is overly powerful already anyway, even if for
>>>>>> PVH we're a little more strict than for PV (I think).
>>>>>
>>>>> Hi Jan, for clarity and to make this actionable, are you suggesting to
>>>>> clarify the commit message by adding wording around "Dom0 is overly
>>>>> powerful already anyway so it is OK so this is OK" ?
>>>>
>>>> Yes, perhaps with "deemed" added. 
>>> OK, for PVH dom0, I will change to " Dom0 is deemed overly powerful already anyway, so it is OK "
>>
>> I don't mind the deemed as you add it, but the important place to add it
>> here is before "OK". I'm sorry, it didn't occur to me that after all the
>> prior discussion this earlier reply of mine could still be mis-interpreted.
>>
>>>> And text for DomU-s similarly extended, as the pointing at the XSM check is presently incomplete (stubdom-s can
>>>> pass that check, after all, as can de-priv qemu running in Dom0).
>>> So sorry, I know so little about this, I can't explain these situations, could you tell me how to describe or help me write a paragraph?
>>
>> I'm afraid that in order to make (propose) such a change you need to be
>> able to explain why it is okay to expose functionality beyond where it's
>> presently exposed. It's not just writing a new paragraph that's needed
>> here. You first need to _check_ that what you do is okay. And once you've
>> done that checking, you then summarize that in writing.
>  
> 
> PHYSDEVOP_map_pirq ends up calling physdev_map_pirq which is protected
> by:
> 
>     ret = xsm_map_domain_pirq(XSM_DM_PRIV, d);
>     if ( ret )
>         return ret;
> 
> Dom0 having permission to do PHYSDEVOP_map_pirq even without has_pirq is
> fine. Device models are also OK because the code we are trying to enable
> is in fact part of the device model. If someone were to run an HVM
> stubdom they might need this patch as well.
> 
> If PHYSDEVOP_map_pirq is allowed, also PHYSDEVOP_unmap_pirq should be
> allowed.
> 
> Is this explanation OK?

This still solely focuses on why the functionality is wanted. There
continues to be nothing about the wider exposure actually being safe.

Jan
Stefano Stabellini Sept. 6, 2024, 11:34 p.m. UTC | #12
On Fri, 6 Sep 2024, Jan Beulich wrote:
> On 06.09.2024 00:51, Stefano Stabellini wrote:
> > On Thu, 5 Sep 2024, Jan Beulich wrote:
> >> On 05.09.2024 08:45, Chen, Jiqian wrote:
> >>> HI,
> >>>
> >>> On 2024/9/4 14:04, Jan Beulich wrote:
> >>>> On 04.09.2024 03:43, Stefano Stabellini wrote:
> >>>>> On Tue, 3 Sep 2024, Jan Beulich wrote:
> >>>>>> On 03.09.2024 12:53, Chen, Jiqian wrote:
> >>>>>>> On 2024/9/3 17:25, Jan Beulich wrote:
> >>>>>>>> On 03.09.2024 09:58, Chen, Jiqian wrote:
> >>>>>>>>> On 2024/9/3 15:42, Jan Beulich wrote:
> >>>>>>>>>> On 03.09.2024 09:04, Jiqian Chen wrote:
> >>>>>>>>>>> When dom0 is PVH type and passthrough a device to HVM domU, Qemu code
> >>>>>>>>>>> xen_pt_realize->xc_physdev_map_pirq and libxl code pci_add_dm_done->
> >>>>>>>>>>> xc_physdev_map_pirq map a pirq for passthrough devices.
> >>>>>>>>>>> In xc_physdev_map_pirq call stack, function hvm_physdev_op has a check
> >>>>>>>>>>> has_pirq(currd), but currd is PVH dom0, PVH has no X86_EMU_USE_PIRQ flag,
> >>>>>>>>>>> so it fails, PHYSDEVOP_map_pirq is not allowed for PVH dom0 in current
> >>>>>>>>>>> codes.
> >>>>>>>>>>>
> >>>>>>>>>>> But it is fine to map interrupts through pirq to a HVM domain whose
> >>>>>>>>>>> XENFEAT_hvm_pirqs is not enabled. Because pirq field is used as a way to
> >>>>>>>>>>> reference interrupts and it is just the way for the device model to
> >>>>>>>>>>> identify which interrupt should be mapped to which domain, however
> >>>>>>>>>>> has_pirq() is just to check if HVM domains route interrupts from
> >>>>>>>>>>> devices(emulated or passthrough) through event channel, so, the has_pirq()
> >>>>>>>>>>> check should not be applied to the PHYSDEVOP_map_pirq issued by dom0.
> >>>>>>>>>>>
> >>>>>>>>>>> So, allow PHYSDEVOP_map_pirq when dom0 is PVH and also allow
> >>>>>>>>>>> PHYSDEVOP_unmap_pirq for the removal device path to unmap pirq. Then the
> >>>>>>>>>>> interrupt of a passthrough device can be successfully mapped to pirq for domU.
> >>>>>>>>>>
> >>>>>>>>>> As before: When you talk about just Dom0, ...
> >>>>>>>>>>
> >>>>>>>>>>> --- a/xen/arch/x86/hvm/hypercall.c
> >>>>>>>>>>> +++ b/xen/arch/x86/hvm/hypercall.c
> >>>>>>>>>>> @@ -73,6 +73,8 @@ long hvm_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
> >>>>>>>>>>>      {
> >>>>>>>>>>>      case PHYSDEVOP_map_pirq:
> >>>>>>>>>>>      case PHYSDEVOP_unmap_pirq:
> >>>>>>>>>>> +        break;
> >>>>>>>>>>> +
> >>>>>>>>>>>      case PHYSDEVOP_eoi:
> >>>>>>>>>>>      case PHYSDEVOP_irq_status_query:
> >>>>>>>>>>>      case PHYSDEVOP_get_free_pirq:
> >>>>>>>>>>
> >>>>>>>>>> ... that ought to match the code. IOW you've again lost why it is okay(ish)
> >>>>>>>>>> (or even necessary) to also permit the op for non-Dom0 (e.g. a PVH stubdom).
> >>>>>>>>>> Similarly imo Dom0 using this on itself wants discussing.
> >>>>>>>>> Do you mean I need to talk about why permit this op for all HVM
> >>>>>>>>
> >>>>>>>> You don't need to invent reasons, but it needs making clear that wider than
> >>>>>>>> necessary (for your purpose) exposure is at least not going to be a problem.
> >>>>>>>>
> >>>>>>>>> and  where can prevent PVH domain calling this for self-mapping, not only dom0?
> >>>>>>>>
> >>>>>>>> Aiui use on itself is limited to Dom0, so only that would need clarifying
> >>>>>>>> (along the lines of the above, i.e. that/why it is not a problem). For
> >>>>>>>> has_pirq() domains use on oneself was already permitted before.
> >>>>>>>
> >>>>>>> Changed commit message to below. Please check and that will be great helpful if you would show me how to modify it.
> >>>>>>> {
> >>>>>>> x86/pvh: Allow (un)map_pirq when dom0 is PVH
> >>>>>>>
> >>>>>>> Problem: when dom0 is PVH type and passthrough a device to HVM domU, Qemu
> >>>>>>> code xen_pt_realize->xc_physdev_map_pirq and libxl code pci_add_dm_done->
> >>>>>>> xc_physdev_map_pirq map a pirq for passthrough devices.
> >>>>>>> In xc_physdev_map_pirq call stack, function hvm_physdev_op has a check
> >>>>>>> has_pirq(currd), but currd is PVH dom0, PVH has no X86_EMU_USE_PIRQ flag,
> >>>>>>> so it fails, PHYSDEVOP_map_pirq is not allowed for PVH dom0 in current
> >>>>>>> codes.
> >>>>>>>
> >>>>>>> To solve above problem, need to remove the chack has_pirq() for that
> >>>>>>> situation(PHYSDEVOP_map_pirq is issued by dom0 for domUs). But without
> >>>>>>> adding other restrictions, PHYSDEVOP_map_pirq will be allowed wider than
> >>>>>>> what the problem need.
> >>>>>>> So, clarify below:
> >>>>>>>
> >>>>>>> For HVM domUs whose XENFEAT_hvm_pirqs is not enabled,it is fine to map
> >>>>>>> interrupts through pirq for them. Because pirq field is used as a way to
> >>>>>>> reference interrupts and it is just the way for the device model to
> >>>>>>> identify which interrupt should be mapped to which domain, however
> >>>>>>> has_pirq() is just to check if HVM domains route interrupts from
> >>>>>>> devices(emulated or passthrough) through event channel, so, remove
> >>>>>>> has_pirq() check has no impact on HVM domUs.
> >>>>>>>
> >>>>>>> For PVH domUs that performs such an operation will fail at the check
> >>>>>>> xsm_map_dedomain_pirq() in physdev_map-nirq().
> >>>>>>>
> >>>>>>> For PVH dom0, it uses vpci and doesn't use event channel, as above talks,
> >>>>>>> it also has no impact.
> >>>>>>> }
> >>>>>>
> >>>>>> This is better than what you had before, and I don't really fancy re-
> >>>>>> writing the description effectively from scratch. So let's just go from
> >>>>>> there. As to the "excess" permission for the sub-ops: The main thing I'm
> >>>>>> after is that it be clarified that we're not going to introduce any
> >>>>>> security issues here. That requires auditing the code, and merely saying
> >>>>>> "also has no impact" is a little too little for my taste. For Dom0 an
> >>>>>> argument may be that it is overly powerful already anyway, even if for
> >>>>>> PVH we're a little more strict than for PV (I think).
> >>>>>
> >>>>> Hi Jan, for clarity and to make this actionable, are you suggesting to
> >>>>> clarify the commit message by adding wording around "Dom0 is overly
> >>>>> powerful already anyway so it is OK so this is OK" ?
> >>>>
> >>>> Yes, perhaps with "deemed" added. 
> >>> OK, for PVH dom0, I will change to " Dom0 is deemed overly powerful already anyway, so it is OK "
> >>
> >> I don't mind the deemed as you add it, but the important place to add it
> >> here is before "OK". I'm sorry, it didn't occur to me that after all the
> >> prior discussion this earlier reply of mine could still be mis-interpreted.
> >>
> >>>> And text for DomU-s similarly extended, as the pointing at the XSM check is presently incomplete (stubdom-s can
> >>>> pass that check, after all, as can de-priv qemu running in Dom0).
> >>> So sorry, I know so little about this, I can't explain these situations, could you tell me how to describe or help me write a paragraph?
> >>
> >> I'm afraid that in order to make (propose) such a change you need to be
> >> able to explain why it is okay to expose functionality beyond where it's
> >> presently exposed. It's not just writing a new paragraph that's needed
> >> here. You first need to _check_ that what you do is okay. And once you've
> >> done that checking, you then summarize that in writing.
> >  
> > 
> > PHYSDEVOP_map_pirq ends up calling physdev_map_pirq which is protected
> > by:
> > 
> >     ret = xsm_map_domain_pirq(XSM_DM_PRIV, d);
> >     if ( ret )
> >         return ret;
> > 
> > Dom0 having permission to do PHYSDEVOP_map_pirq even without has_pirq is
> > fine. Device models are also OK because the code we are trying to enable
> > is in fact part of the device model. If someone were to run an HVM
> > stubdom they might need this patch as well.
> > 
> > If PHYSDEVOP_map_pirq is allowed, also PHYSDEVOP_unmap_pirq should be
> > allowed.
> > 
> > Is this explanation OK?
> 
> This still solely focuses on why the functionality is wanted. There
> continues to be nothing about the wider exposure actually being safe.

I don't think I understand what you would like to be checked or
clarified...

The only wider exposure is to device models, and device models can do a
lot worse than mapping pirqs already. There is no wider exposure to
DomUs. Also PV device models can already do this.

But the above must be clear to you as well, so I am not sure what you
are looking for.
Jan Beulich Sept. 9, 2024, 8:56 a.m. UTC | #13
On 07.09.2024 01:34, Stefano Stabellini wrote:
> On Fri, 6 Sep 2024, Jan Beulich wrote:
>> On 06.09.2024 00:51, Stefano Stabellini wrote:
>>> On Thu, 5 Sep 2024, Jan Beulich wrote:
>>>> On 05.09.2024 08:45, Chen, Jiqian wrote:
>>>>> HI,
>>>>>
>>>>> On 2024/9/4 14:04, Jan Beulich wrote:
>>>>>> On 04.09.2024 03:43, Stefano Stabellini wrote:
>>>>>>> On Tue, 3 Sep 2024, Jan Beulich wrote:
>>>>>>>> On 03.09.2024 12:53, Chen, Jiqian wrote:
>>>>>>>>> On 2024/9/3 17:25, Jan Beulich wrote:
>>>>>>>>>> On 03.09.2024 09:58, Chen, Jiqian wrote:
>>>>>>>>>>> On 2024/9/3 15:42, Jan Beulich wrote:
>>>>>>>>>>>> On 03.09.2024 09:04, Jiqian Chen wrote:
>>>>>>>>>>>>> When dom0 is PVH type and passthrough a device to HVM domU, Qemu code
>>>>>>>>>>>>> xen_pt_realize->xc_physdev_map_pirq and libxl code pci_add_dm_done->
>>>>>>>>>>>>> xc_physdev_map_pirq map a pirq for passthrough devices.
>>>>>>>>>>>>> In xc_physdev_map_pirq call stack, function hvm_physdev_op has a check
>>>>>>>>>>>>> has_pirq(currd), but currd is PVH dom0, PVH has no X86_EMU_USE_PIRQ flag,
>>>>>>>>>>>>> so it fails, PHYSDEVOP_map_pirq is not allowed for PVH dom0 in current
>>>>>>>>>>>>> codes.
>>>>>>>>>>>>>
>>>>>>>>>>>>> But it is fine to map interrupts through pirq to a HVM domain whose
>>>>>>>>>>>>> XENFEAT_hvm_pirqs is not enabled. Because pirq field is used as a way to
>>>>>>>>>>>>> reference interrupts and it is just the way for the device model to
>>>>>>>>>>>>> identify which interrupt should be mapped to which domain, however
>>>>>>>>>>>>> has_pirq() is just to check if HVM domains route interrupts from
>>>>>>>>>>>>> devices(emulated or passthrough) through event channel, so, the has_pirq()
>>>>>>>>>>>>> check should not be applied to the PHYSDEVOP_map_pirq issued by dom0.
>>>>>>>>>>>>>
>>>>>>>>>>>>> So, allow PHYSDEVOP_map_pirq when dom0 is PVH and also allow
>>>>>>>>>>>>> PHYSDEVOP_unmap_pirq for the removal device path to unmap pirq. Then the
>>>>>>>>>>>>> interrupt of a passthrough device can be successfully mapped to pirq for domU.
>>>>>>>>>>>>
>>>>>>>>>>>> As before: When you talk about just Dom0, ...
>>>>>>>>>>>>
>>>>>>>>>>>>> --- a/xen/arch/x86/hvm/hypercall.c
>>>>>>>>>>>>> +++ b/xen/arch/x86/hvm/hypercall.c
>>>>>>>>>>>>> @@ -73,6 +73,8 @@ long hvm_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
>>>>>>>>>>>>>      {
>>>>>>>>>>>>>      case PHYSDEVOP_map_pirq:
>>>>>>>>>>>>>      case PHYSDEVOP_unmap_pirq:
>>>>>>>>>>>>> +        break;
>>>>>>>>>>>>> +
>>>>>>>>>>>>>      case PHYSDEVOP_eoi:
>>>>>>>>>>>>>      case PHYSDEVOP_irq_status_query:
>>>>>>>>>>>>>      case PHYSDEVOP_get_free_pirq:
>>>>>>>>>>>>
>>>>>>>>>>>> ... that ought to match the code. IOW you've again lost why it is okay(ish)
>>>>>>>>>>>> (or even necessary) to also permit the op for non-Dom0 (e.g. a PVH stubdom).
>>>>>>>>>>>> Similarly imo Dom0 using this on itself wants discussing.
>>>>>>>>>>> Do you mean I need to talk about why permit this op for all HVM
>>>>>>>>>>
>>>>>>>>>> You don't need to invent reasons, but it needs making clear that wider than
>>>>>>>>>> necessary (for your purpose) exposure is at least not going to be a problem.
>>>>>>>>>>
>>>>>>>>>>> and  where can prevent PVH domain calling this for self-mapping, not only dom0?
>>>>>>>>>>
>>>>>>>>>> Aiui use on itself is limited to Dom0, so only that would need clarifying
>>>>>>>>>> (along the lines of the above, i.e. that/why it is not a problem). For
>>>>>>>>>> has_pirq() domains use on oneself was already permitted before.
>>>>>>>>>
>>>>>>>>> Changed commit message to below. Please check and that will be great helpful if you would show me how to modify it.
>>>>>>>>> {
>>>>>>>>> x86/pvh: Allow (un)map_pirq when dom0 is PVH
>>>>>>>>>
>>>>>>>>> Problem: when dom0 is PVH type and passthrough a device to HVM domU, Qemu
>>>>>>>>> code xen_pt_realize->xc_physdev_map_pirq and libxl code pci_add_dm_done->
>>>>>>>>> xc_physdev_map_pirq map a pirq for passthrough devices.
>>>>>>>>> In xc_physdev_map_pirq call stack, function hvm_physdev_op has a check
>>>>>>>>> has_pirq(currd), but currd is PVH dom0, PVH has no X86_EMU_USE_PIRQ flag,
>>>>>>>>> so it fails, PHYSDEVOP_map_pirq is not allowed for PVH dom0 in current
>>>>>>>>> codes.
>>>>>>>>>
>>>>>>>>> To solve above problem, need to remove the chack has_pirq() for that
>>>>>>>>> situation(PHYSDEVOP_map_pirq is issued by dom0 for domUs). But without
>>>>>>>>> adding other restrictions, PHYSDEVOP_map_pirq will be allowed wider than
>>>>>>>>> what the problem need.
>>>>>>>>> So, clarify below:
>>>>>>>>>
>>>>>>>>> For HVM domUs whose XENFEAT_hvm_pirqs is not enabled,it is fine to map
>>>>>>>>> interrupts through pirq for them. Because pirq field is used as a way to
>>>>>>>>> reference interrupts and it is just the way for the device model to
>>>>>>>>> identify which interrupt should be mapped to which domain, however
>>>>>>>>> has_pirq() is just to check if HVM domains route interrupts from
>>>>>>>>> devices(emulated or passthrough) through event channel, so, remove
>>>>>>>>> has_pirq() check has no impact on HVM domUs.
>>>>>>>>>
>>>>>>>>> For PVH domUs that performs such an operation will fail at the check
>>>>>>>>> xsm_map_dedomain_pirq() in physdev_map-nirq().
>>>>>>>>>
>>>>>>>>> For PVH dom0, it uses vpci and doesn't use event channel, as above talks,
>>>>>>>>> it also has no impact.
>>>>>>>>> }
>>>>>>>>
>>>>>>>> This is better than what you had before, and I don't really fancy re-
>>>>>>>> writing the description effectively from scratch. So let's just go from
>>>>>>>> there. As to the "excess" permission for the sub-ops: The main thing I'm
>>>>>>>> after is that it be clarified that we're not going to introduce any
>>>>>>>> security issues here. That requires auditing the code, and merely saying
>>>>>>>> "also has no impact" is a little too little for my taste. For Dom0 an
>>>>>>>> argument may be that it is overly powerful already anyway, even if for
>>>>>>>> PVH we're a little more strict than for PV (I think).
>>>>>>>
>>>>>>> Hi Jan, for clarity and to make this actionable, are you suggesting to
>>>>>>> clarify the commit message by adding wording around "Dom0 is overly
>>>>>>> powerful already anyway so it is OK so this is OK" ?
>>>>>>
>>>>>> Yes, perhaps with "deemed" added. 
>>>>> OK, for PVH dom0, I will change to " Dom0 is deemed overly powerful already anyway, so it is OK "
>>>>
>>>> I don't mind the deemed as you add it, but the important place to add it
>>>> here is before "OK". I'm sorry, it didn't occur to me that after all the
>>>> prior discussion this earlier reply of mine could still be mis-interpreted.
>>>>
>>>>>> And text for DomU-s similarly extended, as the pointing at the XSM check is presently incomplete (stubdom-s can
>>>>>> pass that check, after all, as can de-priv qemu running in Dom0).
>>>>> So sorry, I know so little about this, I can't explain these situations, could you tell me how to describe or help me write a paragraph?
>>>>
>>>> I'm afraid that in order to make (propose) such a change you need to be
>>>> able to explain why it is okay to expose functionality beyond where it's
>>>> presently exposed. It's not just writing a new paragraph that's needed
>>>> here. You first need to _check_ that what you do is okay. And once you've
>>>> done that checking, you then summarize that in writing.
>>>  
>>>
>>> PHYSDEVOP_map_pirq ends up calling physdev_map_pirq which is protected
>>> by:
>>>
>>>     ret = xsm_map_domain_pirq(XSM_DM_PRIV, d);
>>>     if ( ret )
>>>         return ret;
>>>
>>> Dom0 having permission to do PHYSDEVOP_map_pirq even without has_pirq is
>>> fine. Device models are also OK because the code we are trying to enable
>>> is in fact part of the device model. If someone were to run an HVM
>>> stubdom they might need this patch as well.
>>>
>>> If PHYSDEVOP_map_pirq is allowed, also PHYSDEVOP_unmap_pirq should be
>>> allowed.
>>>
>>> Is this explanation OK?
>>
>> This still solely focuses on why the functionality is wanted. There
>> continues to be nothing about the wider exposure actually being safe.
> 
> I don't think I understand what you would like to be checked or
> clarified...
> 
> The only wider exposure is to device models, and device models can do a
> lot worse than mapping pirqs already. There is no wider exposure to
> DomUs. Also PV device models can already do this.

What do you mean by "worse"? I hope not "crash Xen"? And _that's_ what I
want to have assurance of, e.g. a PVH/HVM DM not suddenly bringing Xen
down, because these paths previously weren't accessible to them.

Jan
Roger Pau Monné Sept. 9, 2024, 10:04 a.m. UTC | #14
On Mon, Sep 09, 2024 at 10:56:07AM +0200, Jan Beulich wrote:
> On 07.09.2024 01:34, Stefano Stabellini wrote:
> > On Fri, 6 Sep 2024, Jan Beulich wrote:
> >> On 06.09.2024 00:51, Stefano Stabellini wrote:
> >>> On Thu, 5 Sep 2024, Jan Beulich wrote:
> >>>> On 05.09.2024 08:45, Chen, Jiqian wrote:
> >>>>> HI,
> >>>>>
> >>>>> On 2024/9/4 14:04, Jan Beulich wrote:
> >>>>>> On 04.09.2024 03:43, Stefano Stabellini wrote:
> >>>>>>> On Tue, 3 Sep 2024, Jan Beulich wrote:
> >>>>>>>> On 03.09.2024 12:53, Chen, Jiqian wrote:
> >>>>>>>>> On 2024/9/3 17:25, Jan Beulich wrote:
> >>>>>>>>>> On 03.09.2024 09:58, Chen, Jiqian wrote:
> >>>>>>>>>>> On 2024/9/3 15:42, Jan Beulich wrote:
> >>>>>>>>>>>> On 03.09.2024 09:04, Jiqian Chen wrote:
> >>>>>>>>>>>>> When dom0 is PVH type and passthrough a device to HVM domU, Qemu code
> >>>>>>>>>>>>> xen_pt_realize->xc_physdev_map_pirq and libxl code pci_add_dm_done->
> >>>>>>>>>>>>> xc_physdev_map_pirq map a pirq for passthrough devices.
> >>>>>>>>>>>>> In xc_physdev_map_pirq call stack, function hvm_physdev_op has a check
> >>>>>>>>>>>>> has_pirq(currd), but currd is PVH dom0, PVH has no X86_EMU_USE_PIRQ flag,
> >>>>>>>>>>>>> so it fails, PHYSDEVOP_map_pirq is not allowed for PVH dom0 in current
> >>>>>>>>>>>>> codes.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> But it is fine to map interrupts through pirq to a HVM domain whose
> >>>>>>>>>>>>> XENFEAT_hvm_pirqs is not enabled. Because pirq field is used as a way to
> >>>>>>>>>>>>> reference interrupts and it is just the way for the device model to
> >>>>>>>>>>>>> identify which interrupt should be mapped to which domain, however
> >>>>>>>>>>>>> has_pirq() is just to check if HVM domains route interrupts from
> >>>>>>>>>>>>> devices(emulated or passthrough) through event channel, so, the has_pirq()
> >>>>>>>>>>>>> check should not be applied to the PHYSDEVOP_map_pirq issued by dom0.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> So, allow PHYSDEVOP_map_pirq when dom0 is PVH and also allow
> >>>>>>>>>>>>> PHYSDEVOP_unmap_pirq for the removal device path to unmap pirq. Then the
> >>>>>>>>>>>>> interrupt of a passthrough device can be successfully mapped to pirq for domU.
> >>>>>>>>>>>>
> >>>>>>>>>>>> As before: When you talk about just Dom0, ...
> >>>>>>>>>>>>
> >>>>>>>>>>>>> --- a/xen/arch/x86/hvm/hypercall.c
> >>>>>>>>>>>>> +++ b/xen/arch/x86/hvm/hypercall.c
> >>>>>>>>>>>>> @@ -73,6 +73,8 @@ long hvm_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
> >>>>>>>>>>>>>      {
> >>>>>>>>>>>>>      case PHYSDEVOP_map_pirq:
> >>>>>>>>>>>>>      case PHYSDEVOP_unmap_pirq:
> >>>>>>>>>>>>> +        break;
> >>>>>>>>>>>>> +
> >>>>>>>>>>>>>      case PHYSDEVOP_eoi:
> >>>>>>>>>>>>>      case PHYSDEVOP_irq_status_query:
> >>>>>>>>>>>>>      case PHYSDEVOP_get_free_pirq:
> >>>>>>>>>>>>
> >>>>>>>>>>>> ... that ought to match the code. IOW you've again lost why it is okay(ish)
> >>>>>>>>>>>> (or even necessary) to also permit the op for non-Dom0 (e.g. a PVH stubdom).
> >>>>>>>>>>>> Similarly imo Dom0 using this on itself wants discussing.
> >>>>>>>>>>> Do you mean I need to talk about why permit this op for all HVM
> >>>>>>>>>>
> >>>>>>>>>> You don't need to invent reasons, but it needs making clear that wider than
> >>>>>>>>>> necessary (for your purpose) exposure is at least not going to be a problem.
> >>>>>>>>>>
> >>>>>>>>>>> and  where can prevent PVH domain calling this for self-mapping, not only dom0?
> >>>>>>>>>>
> >>>>>>>>>> Aiui use on itself is limited to Dom0, so only that would need clarifying
> >>>>>>>>>> (along the lines of the above, i.e. that/why it is not a problem). For
> >>>>>>>>>> has_pirq() domains use on oneself was already permitted before.
> >>>>>>>>>
> >>>>>>>>> Changed commit message to below. Please check and that will be great helpful if you would show me how to modify it.
> >>>>>>>>> {
> >>>>>>>>> x86/pvh: Allow (un)map_pirq when dom0 is PVH
> >>>>>>>>>
> >>>>>>>>> Problem: when dom0 is PVH type and passthrough a device to HVM domU, Qemu
> >>>>>>>>> code xen_pt_realize->xc_physdev_map_pirq and libxl code pci_add_dm_done->
> >>>>>>>>> xc_physdev_map_pirq map a pirq for passthrough devices.
> >>>>>>>>> In xc_physdev_map_pirq call stack, function hvm_physdev_op has a check
> >>>>>>>>> has_pirq(currd), but currd is PVH dom0, PVH has no X86_EMU_USE_PIRQ flag,
> >>>>>>>>> so it fails, PHYSDEVOP_map_pirq is not allowed for PVH dom0 in current
> >>>>>>>>> codes.
> >>>>>>>>>
> >>>>>>>>> To solve above problem, need to remove the chack has_pirq() for that
> >>>>>>>>> situation(PHYSDEVOP_map_pirq is issued by dom0 for domUs). But without
> >>>>>>>>> adding other restrictions, PHYSDEVOP_map_pirq will be allowed wider than
> >>>>>>>>> what the problem need.
> >>>>>>>>> So, clarify below:
> >>>>>>>>>
> >>>>>>>>> For HVM domUs whose XENFEAT_hvm_pirqs is not enabled,it is fine to map
> >>>>>>>>> interrupts through pirq for them. Because pirq field is used as a way to
> >>>>>>>>> reference interrupts and it is just the way for the device model to
> >>>>>>>>> identify which interrupt should be mapped to which domain, however
> >>>>>>>>> has_pirq() is just to check if HVM domains route interrupts from
> >>>>>>>>> devices(emulated or passthrough) through event channel, so, remove
> >>>>>>>>> has_pirq() check has no impact on HVM domUs.
> >>>>>>>>>
> >>>>>>>>> For PVH domUs that performs such an operation will fail at the check
> >>>>>>>>> xsm_map_dedomain_pirq() in physdev_map-nirq().
> >>>>>>>>>
> >>>>>>>>> For PVH dom0, it uses vpci and doesn't use event channel, as above talks,
> >>>>>>>>> it also has no impact.
> >>>>>>>>> }
> >>>>>>>>
> >>>>>>>> This is better than what you had before, and I don't really fancy re-
> >>>>>>>> writing the description effectively from scratch. So let's just go from
> >>>>>>>> there. As to the "excess" permission for the sub-ops: The main thing I'm
> >>>>>>>> after is that it be clarified that we're not going to introduce any
> >>>>>>>> security issues here. That requires auditing the code, and merely saying
> >>>>>>>> "also has no impact" is a little too little for my taste. For Dom0 an
> >>>>>>>> argument may be that it is overly powerful already anyway, even if for
> >>>>>>>> PVH we're a little more strict than for PV (I think).
> >>>>>>>
> >>>>>>> Hi Jan, for clarity and to make this actionable, are you suggesting to
> >>>>>>> clarify the commit message by adding wording around "Dom0 is overly
> >>>>>>> powerful already anyway so it is OK so this is OK" ?
> >>>>>>
> >>>>>> Yes, perhaps with "deemed" added. 
> >>>>> OK, for PVH dom0, I will change to " Dom0 is deemed overly powerful already anyway, so it is OK "
> >>>>
> >>>> I don't mind the deemed as you add it, but the important place to add it
> >>>> here is before "OK". I'm sorry, it didn't occur to me that after all the
> >>>> prior discussion this earlier reply of mine could still be mis-interpreted.
> >>>>
> >>>>>> And text for DomU-s similarly extended, as the pointing at the XSM check is presently incomplete (stubdom-s can
> >>>>>> pass that check, after all, as can de-priv qemu running in Dom0).
> >>>>> So sorry, I know so little about this, I can't explain these situations, could you tell me how to describe or help me write a paragraph?
> >>>>
> >>>> I'm afraid that in order to make (propose) such a change you need to be
> >>>> able to explain why it is okay to expose functionality beyond where it's
> >>>> presently exposed. It's not just writing a new paragraph that's needed
> >>>> here. You first need to _check_ that what you do is okay. And once you've
> >>>> done that checking, you then summarize that in writing.
> >>>  
> >>>
> >>> PHYSDEVOP_map_pirq ends up calling physdev_map_pirq which is protected
> >>> by:
> >>>
> >>>     ret = xsm_map_domain_pirq(XSM_DM_PRIV, d);
> >>>     if ( ret )
> >>>         return ret;
> >>>
> >>> Dom0 having permission to do PHYSDEVOP_map_pirq even without has_pirq is
> >>> fine. Device models are also OK because the code we are trying to enable
> >>> is in fact part of the device model. If someone were to run an HVM
> >>> stubdom they might need this patch as well.
> >>>
> >>> If PHYSDEVOP_map_pirq is allowed, also PHYSDEVOP_unmap_pirq should be
> >>> allowed.
> >>>
> >>> Is this explanation OK?
> >>
> >> This still solely focuses on why the functionality is wanted. There
> >> continues to be nothing about the wider exposure actually being safe.
> > 
> > I don't think I understand what you would like to be checked or
> > clarified...
> > 
> > The only wider exposure is to device models, and device models can do a
> > lot worse than mapping pirqs already. There is no wider exposure to
> > DomUs. Also PV device models can already do this.
> 
> What do you mean by "worse"? I hope not "crash Xen"? And _that's_ what I
> want to have assurance of, e.g. a PVH/HVM DM not suddenly bringing Xen
> down, because these paths previously weren't accessible to them.

What about a commit message along the lines of:

x86/hvm: allow {,un}map_pirq hypercalls unconditionally

The current hypercall interfaces to manage and assign interrupts to
domains is mostly based in using pIRQs as handlers.  Such pIRQ values
are abstract domain-specific references to interrupts.

Classic HVM domains can have access to {,un}map_pirq hypercalls if the
domain is allowed to route physical interrupts over event channels.
That's however a different interface, limited to only mapping
interrupts to itself. PVH domains on the other hand never had access
to the interface, as PVH domains are not allowed to route interrupts
over event channels.

In order to allow setting up PCI passthrough from a PVH domain it
needs access to the {,un}map_pirq hypercalls so interrupts can be
assigned a pIRQ handler that can then be used by further hypercalls to
bind the interrupt to a domain.

Note that the {,un}map_pirq hypercalls end up calling helpers that are
already used against a PVH domain in order to setup interrupts for the
hardware domain when running in PVH mode.  physdev_map_pirq() will
call allocate_and_map_{gsi,msi}_pirq() which is already used by the
vIO-APIC or the vPCI code respectively.  So the exposed code paths are
not new when targeting a PVH domain, but rather previous callers are
not hypercall but emulation based.

Regards, Roger.
Jan Beulich Sept. 9, 2024, 10:22 a.m. UTC | #15
On 09.09.2024 12:04, Roger Pau Monné wrote:
> On Mon, Sep 09, 2024 at 10:56:07AM +0200, Jan Beulich wrote:
>> On 07.09.2024 01:34, Stefano Stabellini wrote:
>>> On Fri, 6 Sep 2024, Jan Beulich wrote:
>>>> On 06.09.2024 00:51, Stefano Stabellini wrote:
>>>>> On Thu, 5 Sep 2024, Jan Beulich wrote:
>>>>>> On 05.09.2024 08:45, Chen, Jiqian wrote:
>>>>>>> HI,
>>>>>>>
>>>>>>> On 2024/9/4 14:04, Jan Beulich wrote:
>>>>>>>> On 04.09.2024 03:43, Stefano Stabellini wrote:
>>>>>>>>> On Tue, 3 Sep 2024, Jan Beulich wrote:
>>>>>>>>>> On 03.09.2024 12:53, Chen, Jiqian wrote:
>>>>>>>>>>> On 2024/9/3 17:25, Jan Beulich wrote:
>>>>>>>>>>>> On 03.09.2024 09:58, Chen, Jiqian wrote:
>>>>>>>>>>>>> On 2024/9/3 15:42, Jan Beulich wrote:
>>>>>>>>>>>>>> On 03.09.2024 09:04, Jiqian Chen wrote:
>>>>>>>>>>>>>>> When dom0 is PVH type and passthrough a device to HVM domU, Qemu code
>>>>>>>>>>>>>>> xen_pt_realize->xc_physdev_map_pirq and libxl code pci_add_dm_done->
>>>>>>>>>>>>>>> xc_physdev_map_pirq map a pirq for passthrough devices.
>>>>>>>>>>>>>>> In xc_physdev_map_pirq call stack, function hvm_physdev_op has a check
>>>>>>>>>>>>>>> has_pirq(currd), but currd is PVH dom0, PVH has no X86_EMU_USE_PIRQ flag,
>>>>>>>>>>>>>>> so it fails, PHYSDEVOP_map_pirq is not allowed for PVH dom0 in current
>>>>>>>>>>>>>>> codes.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> But it is fine to map interrupts through pirq to a HVM domain whose
>>>>>>>>>>>>>>> XENFEAT_hvm_pirqs is not enabled. Because pirq field is used as a way to
>>>>>>>>>>>>>>> reference interrupts and it is just the way for the device model to
>>>>>>>>>>>>>>> identify which interrupt should be mapped to which domain, however
>>>>>>>>>>>>>>> has_pirq() is just to check if HVM domains route interrupts from
>>>>>>>>>>>>>>> devices(emulated or passthrough) through event channel, so, the has_pirq()
>>>>>>>>>>>>>>> check should not be applied to the PHYSDEVOP_map_pirq issued by dom0.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> So, allow PHYSDEVOP_map_pirq when dom0 is PVH and also allow
>>>>>>>>>>>>>>> PHYSDEVOP_unmap_pirq for the removal device path to unmap pirq. Then the
>>>>>>>>>>>>>>> interrupt of a passthrough device can be successfully mapped to pirq for domU.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> As before: When you talk about just Dom0, ...
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> --- a/xen/arch/x86/hvm/hypercall.c
>>>>>>>>>>>>>>> +++ b/xen/arch/x86/hvm/hypercall.c
>>>>>>>>>>>>>>> @@ -73,6 +73,8 @@ long hvm_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
>>>>>>>>>>>>>>>      {
>>>>>>>>>>>>>>>      case PHYSDEVOP_map_pirq:
>>>>>>>>>>>>>>>      case PHYSDEVOP_unmap_pirq:
>>>>>>>>>>>>>>> +        break;
>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>      case PHYSDEVOP_eoi:
>>>>>>>>>>>>>>>      case PHYSDEVOP_irq_status_query:
>>>>>>>>>>>>>>>      case PHYSDEVOP_get_free_pirq:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> ... that ought to match the code. IOW you've again lost why it is okay(ish)
>>>>>>>>>>>>>> (or even necessary) to also permit the op for non-Dom0 (e.g. a PVH stubdom).
>>>>>>>>>>>>>> Similarly imo Dom0 using this on itself wants discussing.
>>>>>>>>>>>>> Do you mean I need to talk about why permit this op for all HVM
>>>>>>>>>>>>
>>>>>>>>>>>> You don't need to invent reasons, but it needs making clear that wider than
>>>>>>>>>>>> necessary (for your purpose) exposure is at least not going to be a problem.
>>>>>>>>>>>>
>>>>>>>>>>>>> and  where can prevent PVH domain calling this for self-mapping, not only dom0?
>>>>>>>>>>>>
>>>>>>>>>>>> Aiui use on itself is limited to Dom0, so only that would need clarifying
>>>>>>>>>>>> (along the lines of the above, i.e. that/why it is not a problem). For
>>>>>>>>>>>> has_pirq() domains use on oneself was already permitted before.
>>>>>>>>>>>
>>>>>>>>>>> Changed commit message to below. Please check and that will be great helpful if you would show me how to modify it.
>>>>>>>>>>> {
>>>>>>>>>>> x86/pvh: Allow (un)map_pirq when dom0 is PVH
>>>>>>>>>>>
>>>>>>>>>>> Problem: when dom0 is PVH type and passthrough a device to HVM domU, Qemu
>>>>>>>>>>> code xen_pt_realize->xc_physdev_map_pirq and libxl code pci_add_dm_done->
>>>>>>>>>>> xc_physdev_map_pirq map a pirq for passthrough devices.
>>>>>>>>>>> In xc_physdev_map_pirq call stack, function hvm_physdev_op has a check
>>>>>>>>>>> has_pirq(currd), but currd is PVH dom0, PVH has no X86_EMU_USE_PIRQ flag,
>>>>>>>>>>> so it fails, PHYSDEVOP_map_pirq is not allowed for PVH dom0 in current
>>>>>>>>>>> codes.
>>>>>>>>>>>
>>>>>>>>>>> To solve above problem, need to remove the chack has_pirq() for that
>>>>>>>>>>> situation(PHYSDEVOP_map_pirq is issued by dom0 for domUs). But without
>>>>>>>>>>> adding other restrictions, PHYSDEVOP_map_pirq will be allowed wider than
>>>>>>>>>>> what the problem need.
>>>>>>>>>>> So, clarify below:
>>>>>>>>>>>
>>>>>>>>>>> For HVM domUs whose XENFEAT_hvm_pirqs is not enabled,it is fine to map
>>>>>>>>>>> interrupts through pirq for them. Because pirq field is used as a way to
>>>>>>>>>>> reference interrupts and it is just the way for the device model to
>>>>>>>>>>> identify which interrupt should be mapped to which domain, however
>>>>>>>>>>> has_pirq() is just to check if HVM domains route interrupts from
>>>>>>>>>>> devices(emulated or passthrough) through event channel, so, remove
>>>>>>>>>>> has_pirq() check has no impact on HVM domUs.
>>>>>>>>>>>
>>>>>>>>>>> For PVH domUs that performs such an operation will fail at the check
>>>>>>>>>>> xsm_map_dedomain_pirq() in physdev_map-nirq().
>>>>>>>>>>>
>>>>>>>>>>> For PVH dom0, it uses vpci and doesn't use event channel, as above talks,
>>>>>>>>>>> it also has no impact.
>>>>>>>>>>> }
>>>>>>>>>>
>>>>>>>>>> This is better than what you had before, and I don't really fancy re-
>>>>>>>>>> writing the description effectively from scratch. So let's just go from
>>>>>>>>>> there. As to the "excess" permission for the sub-ops: The main thing I'm
>>>>>>>>>> after is that it be clarified that we're not going to introduce any
>>>>>>>>>> security issues here. That requires auditing the code, and merely saying
>>>>>>>>>> "also has no impact" is a little too little for my taste. For Dom0 an
>>>>>>>>>> argument may be that it is overly powerful already anyway, even if for
>>>>>>>>>> PVH we're a little more strict than for PV (I think).
>>>>>>>>>
>>>>>>>>> Hi Jan, for clarity and to make this actionable, are you suggesting to
>>>>>>>>> clarify the commit message by adding wording around "Dom0 is overly
>>>>>>>>> powerful already anyway so it is OK so this is OK" ?
>>>>>>>>
>>>>>>>> Yes, perhaps with "deemed" added. 
>>>>>>> OK, for PVH dom0, I will change to " Dom0 is deemed overly powerful already anyway, so it is OK "
>>>>>>
>>>>>> I don't mind the deemed as you add it, but the important place to add it
>>>>>> here is before "OK". I'm sorry, it didn't occur to me that after all the
>>>>>> prior discussion this earlier reply of mine could still be mis-interpreted.
>>>>>>
>>>>>>>> And text for DomU-s similarly extended, as the pointing at the XSM check is presently incomplete (stubdom-s can
>>>>>>>> pass that check, after all, as can de-priv qemu running in Dom0).
>>>>>>> So sorry, I know so little about this, I can't explain these situations, could you tell me how to describe or help me write a paragraph?
>>>>>>
>>>>>> I'm afraid that in order to make (propose) such a change you need to be
>>>>>> able to explain why it is okay to expose functionality beyond where it's
>>>>>> presently exposed. It's not just writing a new paragraph that's needed
>>>>>> here. You first need to _check_ that what you do is okay. And once you've
>>>>>> done that checking, you then summarize that in writing.
>>>>>  
>>>>>
>>>>> PHYSDEVOP_map_pirq ends up calling physdev_map_pirq which is protected
>>>>> by:
>>>>>
>>>>>     ret = xsm_map_domain_pirq(XSM_DM_PRIV, d);
>>>>>     if ( ret )
>>>>>         return ret;
>>>>>
>>>>> Dom0 having permission to do PHYSDEVOP_map_pirq even without has_pirq is
>>>>> fine. Device models are also OK because the code we are trying to enable
>>>>> is in fact part of the device model. If someone were to run an HVM
>>>>> stubdom they might need this patch as well.
>>>>>
>>>>> If PHYSDEVOP_map_pirq is allowed, also PHYSDEVOP_unmap_pirq should be
>>>>> allowed.
>>>>>
>>>>> Is this explanation OK?
>>>>
>>>> This still solely focuses on why the functionality is wanted. There
>>>> continues to be nothing about the wider exposure actually being safe.
>>>
>>> I don't think I understand what you would like to be checked or
>>> clarified...
>>>
>>> The only wider exposure is to device models, and device models can do a
>>> lot worse than mapping pirqs already. There is no wider exposure to
>>> DomUs. Also PV device models can already do this.
>>
>> What do you mean by "worse"? I hope not "crash Xen"? And _that's_ what I
>> want to have assurance of, e.g. a PVH/HVM DM not suddenly bringing Xen
>> down, because these paths previously weren't accessible to them.
> 
> What about a commit message along the lines of:
> 
> x86/hvm: allow {,un}map_pirq hypercalls unconditionally
> 
> The current hypercall interfaces to manage and assign interrupts to
> domains is mostly based in using pIRQs as handlers.  Such pIRQ values
> are abstract domain-specific references to interrupts.
> 
> Classic HVM domains can have access to {,un}map_pirq hypercalls if the
> domain is allowed to route physical interrupts over event channels.
> That's however a different interface, limited to only mapping
> interrupts to itself. PVH domains on the other hand never had access
> to the interface, as PVH domains are not allowed to route interrupts
> over event channels.
> 
> In order to allow setting up PCI passthrough from a PVH domain it
> needs access to the {,un}map_pirq hypercalls so interrupts can be
> assigned a pIRQ handler that can then be used by further hypercalls to
> bind the interrupt to a domain.
> 
> Note that the {,un}map_pirq hypercalls end up calling helpers that are
> already used against a PVH domain in order to setup interrupts for the
> hardware domain when running in PVH mode.  physdev_map_pirq() will
> call allocate_and_map_{gsi,msi}_pirq() which is already used by the
> vIO-APIC or the vPCI code respectively.  So the exposed code paths are
> not new when targeting a PVH domain, but rather previous callers are
> not hypercall but emulation based.

I think I'd be fine with that, thanks.

Jan
Chen, Jiqian Sept. 9, 2024, 10:35 a.m. UTC | #16
On 2024/9/9 18:04, Roger Pau Monné wrote:
> On Mon, Sep 09, 2024 at 10:56:07AM +0200, Jan Beulich wrote:
>> On 07.09.2024 01:34, Stefano Stabellini wrote:
>>> On Fri, 6 Sep 2024, Jan Beulich wrote:
>>>> On 06.09.2024 00:51, Stefano Stabellini wrote:
>>>>> On Thu, 5 Sep 2024, Jan Beulich wrote:
>>>>>> On 05.09.2024 08:45, Chen, Jiqian wrote:
>>>>>>> HI,
>>>>>>>
>>>>>>> On 2024/9/4 14:04, Jan Beulich wrote:
>>>>>>>> On 04.09.2024 03:43, Stefano Stabellini wrote:
>>>>>>>>> On Tue, 3 Sep 2024, Jan Beulich wrote:
>>>>>>>>>> On 03.09.2024 12:53, Chen, Jiqian wrote:
>>>>>>>>>>> On 2024/9/3 17:25, Jan Beulich wrote:
>>>>>>>>>>>> On 03.09.2024 09:58, Chen, Jiqian wrote:
>>>>>>>>>>>>> On 2024/9/3 15:42, Jan Beulich wrote:
>>>>>>>>>>>>>> On 03.09.2024 09:04, Jiqian Chen wrote:
>>>>>>>>>>>>>>> When dom0 is PVH type and passthrough a device to HVM domU, Qemu code
>>>>>>>>>>>>>>> xen_pt_realize->xc_physdev_map_pirq and libxl code pci_add_dm_done->
>>>>>>>>>>>>>>> xc_physdev_map_pirq map a pirq for passthrough devices.
>>>>>>>>>>>>>>> In xc_physdev_map_pirq call stack, function hvm_physdev_op has a check
>>>>>>>>>>>>>>> has_pirq(currd), but currd is PVH dom0, PVH has no X86_EMU_USE_PIRQ flag,
>>>>>>>>>>>>>>> so it fails, PHYSDEVOP_map_pirq is not allowed for PVH dom0 in current
>>>>>>>>>>>>>>> codes.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> But it is fine to map interrupts through pirq to a HVM domain whose
>>>>>>>>>>>>>>> XENFEAT_hvm_pirqs is not enabled. Because pirq field is used as a way to
>>>>>>>>>>>>>>> reference interrupts and it is just the way for the device model to
>>>>>>>>>>>>>>> identify which interrupt should be mapped to which domain, however
>>>>>>>>>>>>>>> has_pirq() is just to check if HVM domains route interrupts from
>>>>>>>>>>>>>>> devices(emulated or passthrough) through event channel, so, the has_pirq()
>>>>>>>>>>>>>>> check should not be applied to the PHYSDEVOP_map_pirq issued by dom0.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> So, allow PHYSDEVOP_map_pirq when dom0 is PVH and also allow
>>>>>>>>>>>>>>> PHYSDEVOP_unmap_pirq for the removal device path to unmap pirq. Then the
>>>>>>>>>>>>>>> interrupt of a passthrough device can be successfully mapped to pirq for domU.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> As before: When you talk about just Dom0, ...
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> --- a/xen/arch/x86/hvm/hypercall.c
>>>>>>>>>>>>>>> +++ b/xen/arch/x86/hvm/hypercall.c
>>>>>>>>>>>>>>> @@ -73,6 +73,8 @@ long hvm_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
>>>>>>>>>>>>>>>      {
>>>>>>>>>>>>>>>      case PHYSDEVOP_map_pirq:
>>>>>>>>>>>>>>>      case PHYSDEVOP_unmap_pirq:
>>>>>>>>>>>>>>> +        break;
>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>      case PHYSDEVOP_eoi:
>>>>>>>>>>>>>>>      case PHYSDEVOP_irq_status_query:
>>>>>>>>>>>>>>>      case PHYSDEVOP_get_free_pirq:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> ... that ought to match the code. IOW you've again lost why it is okay(ish)
>>>>>>>>>>>>>> (or even necessary) to also permit the op for non-Dom0 (e.g. a PVH stubdom).
>>>>>>>>>>>>>> Similarly imo Dom0 using this on itself wants discussing.
>>>>>>>>>>>>> Do you mean I need to talk about why permit this op for all HVM
>>>>>>>>>>>>
>>>>>>>>>>>> You don't need to invent reasons, but it needs making clear that wider than
>>>>>>>>>>>> necessary (for your purpose) exposure is at least not going to be a problem.
>>>>>>>>>>>>
>>>>>>>>>>>>> and  where can prevent PVH domain calling this for self-mapping, not only dom0?
>>>>>>>>>>>>
>>>>>>>>>>>> Aiui use on itself is limited to Dom0, so only that would need clarifying
>>>>>>>>>>>> (along the lines of the above, i.e. that/why it is not a problem). For
>>>>>>>>>>>> has_pirq() domains use on oneself was already permitted before.
>>>>>>>>>>>
>>>>>>>>>>> Changed commit message to below. Please check and that will be great helpful if you would show me how to modify it.
>>>>>>>>>>> {
>>>>>>>>>>> x86/pvh: Allow (un)map_pirq when dom0 is PVH
>>>>>>>>>>>
>>>>>>>>>>> Problem: when dom0 is PVH type and passthrough a device to HVM domU, Qemu
>>>>>>>>>>> code xen_pt_realize->xc_physdev_map_pirq and libxl code pci_add_dm_done->
>>>>>>>>>>> xc_physdev_map_pirq map a pirq for passthrough devices.
>>>>>>>>>>> In xc_physdev_map_pirq call stack, function hvm_physdev_op has a check
>>>>>>>>>>> has_pirq(currd), but currd is PVH dom0, PVH has no X86_EMU_USE_PIRQ flag,
>>>>>>>>>>> so it fails, PHYSDEVOP_map_pirq is not allowed for PVH dom0 in current
>>>>>>>>>>> codes.
>>>>>>>>>>>
>>>>>>>>>>> To solve above problem, need to remove the chack has_pirq() for that
>>>>>>>>>>> situation(PHYSDEVOP_map_pirq is issued by dom0 for domUs). But without
>>>>>>>>>>> adding other restrictions, PHYSDEVOP_map_pirq will be allowed wider than
>>>>>>>>>>> what the problem need.
>>>>>>>>>>> So, clarify below:
>>>>>>>>>>>
>>>>>>>>>>> For HVM domUs whose XENFEAT_hvm_pirqs is not enabled,it is fine to map
>>>>>>>>>>> interrupts through pirq for them. Because pirq field is used as a way to
>>>>>>>>>>> reference interrupts and it is just the way for the device model to
>>>>>>>>>>> identify which interrupt should be mapped to which domain, however
>>>>>>>>>>> has_pirq() is just to check if HVM domains route interrupts from
>>>>>>>>>>> devices(emulated or passthrough) through event channel, so, remove
>>>>>>>>>>> has_pirq() check has no impact on HVM domUs.
>>>>>>>>>>>
>>>>>>>>>>> For PVH domUs that performs such an operation will fail at the check
>>>>>>>>>>> xsm_map_dedomain_pirq() in physdev_map-nirq().
>>>>>>>>>>>
>>>>>>>>>>> For PVH dom0, it uses vpci and doesn't use event channel, as above talks,
>>>>>>>>>>> it also has no impact.
>>>>>>>>>>> }
>>>>>>>>>>
>>>>>>>>>> This is better than what you had before, and I don't really fancy re-
>>>>>>>>>> writing the description effectively from scratch. So let's just go from
>>>>>>>>>> there. As to the "excess" permission for the sub-ops: The main thing I'm
>>>>>>>>>> after is that it be clarified that we're not going to introduce any
>>>>>>>>>> security issues here. That requires auditing the code, and merely saying
>>>>>>>>>> "also has no impact" is a little too little for my taste. For Dom0 an
>>>>>>>>>> argument may be that it is overly powerful already anyway, even if for
>>>>>>>>>> PVH we're a little more strict than for PV (I think).
>>>>>>>>>
>>>>>>>>> Hi Jan, for clarity and to make this actionable, are you suggesting to
>>>>>>>>> clarify the commit message by adding wording around "Dom0 is overly
>>>>>>>>> powerful already anyway so it is OK so this is OK" ?
>>>>>>>>
>>>>>>>> Yes, perhaps with "deemed" added. 
>>>>>>> OK, for PVH dom0, I will change to " Dom0 is deemed overly powerful already anyway, so it is OK "
>>>>>>
>>>>>> I don't mind the deemed as you add it, but the important place to add it
>>>>>> here is before "OK". I'm sorry, it didn't occur to me that after all the
>>>>>> prior discussion this earlier reply of mine could still be mis-interpreted.
>>>>>>
>>>>>>>> And text for DomU-s similarly extended, as the pointing at the XSM check is presently incomplete (stubdom-s can
>>>>>>>> pass that check, after all, as can de-priv qemu running in Dom0).
>>>>>>> So sorry, I know so little about this, I can't explain these situations, could you tell me how to describe or help me write a paragraph?
>>>>>>
>>>>>> I'm afraid that in order to make (propose) such a change you need to be
>>>>>> able to explain why it is okay to expose functionality beyond where it's
>>>>>> presently exposed. It's not just writing a new paragraph that's needed
>>>>>> here. You first need to _check_ that what you do is okay. And once you've
>>>>>> done that checking, you then summarize that in writing.
>>>>>  
>>>>>
>>>>> PHYSDEVOP_map_pirq ends up calling physdev_map_pirq which is protected
>>>>> by:
>>>>>
>>>>>     ret = xsm_map_domain_pirq(XSM_DM_PRIV, d);
>>>>>     if ( ret )
>>>>>         return ret;
>>>>>
>>>>> Dom0 having permission to do PHYSDEVOP_map_pirq even without has_pirq is
>>>>> fine. Device models are also OK because the code we are trying to enable
>>>>> is in fact part of the device model. If someone were to run an HVM
>>>>> stubdom they might need this patch as well.
>>>>>
>>>>> If PHYSDEVOP_map_pirq is allowed, also PHYSDEVOP_unmap_pirq should be
>>>>> allowed.
>>>>>
>>>>> Is this explanation OK?
>>>>
>>>> This still solely focuses on why the functionality is wanted. There
>>>> continues to be nothing about the wider exposure actually being safe.
>>>
>>> I don't think I understand what you would like to be checked or
>>> clarified...
>>>
>>> The only wider exposure is to device models, and device models can do a
>>> lot worse than mapping pirqs already. There is no wider exposure to
>>> DomUs. Also PV device models can already do this.
>>
>> What do you mean by "worse"? I hope not "crash Xen"? And _that's_ what I
>> want to have assurance of, e.g. a PVH/HVM DM not suddenly bringing Xen
>> down, because these paths previously weren't accessible to them.
> 
> What about a commit message along the lines of:
> 
> x86/hvm: allow {,un}map_pirq hypercalls unconditionally
> 
> The current hypercall interfaces to manage and assign interrupts to
> domains is mostly based in using pIRQs as handlers.  Such pIRQ values
> are abstract domain-specific references to interrupts.
> 
> Classic HVM domains can have access to {,un}map_pirq hypercalls if the
> domain is allowed to route physical interrupts over event channels.
> That's however a different interface, limited to only mapping
> interrupts to itself. PVH domains on the other hand never had access
> to the interface, as PVH domains are not allowed to route interrupts
> over event channels.
> 
> In order to allow setting up PCI passthrough from a PVH domain it
> needs access to the {,un}map_pirq hypercalls so interrupts can be
> assigned a pIRQ handler that can then be used by further hypercalls to
> bind the interrupt to a domain.
> 
> Note that the {,un}map_pirq hypercalls end up calling helpers that are
> already used against a PVH domain in order to setup interrupts for the
> hardware domain when running in PVH mode.  physdev_map_pirq() will
> call allocate_and_map_{gsi,msi}_pirq() which is already used by the
> vIO-APIC or the vPCI code respectively.  So the exposed code paths are
> not new when targeting a PVH domain, but rather previous callers are
> not hypercall but emulation based.

Thank you three very very much for your help.
I will change to this in next version.

> 
> Regards, Roger.
diff mbox series

Patch

diff --git a/xen/arch/x86/hvm/hypercall.c b/xen/arch/x86/hvm/hypercall.c
index f023f7879e24..81883c8d4f60 100644
--- a/xen/arch/x86/hvm/hypercall.c
+++ b/xen/arch/x86/hvm/hypercall.c
@@ -73,6 +73,8 @@  long hvm_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
     {
     case PHYSDEVOP_map_pirq:
     case PHYSDEVOP_unmap_pirq:
+        break;
+
     case PHYSDEVOP_eoi:
     case PHYSDEVOP_irq_status_query:
     case PHYSDEVOP_get_free_pirq: