[v5,1/7] s390: ap: kvm: add PQAP interception for AQIC

Message ID	1552493104-30510-2-git-send-email-pmorel@linux.ibm.com (mailing list archive)
State	New, archived
Headers	Return-Path: <kvm-owner@kernel.org>
	From: Pierre Morel <pmorel@linux.ibm.com>
	To: borntraeger@de.ibm.com
	Cc: alex.williamson@redhat.com, cohuck@redhat.com, linux-kernel@vger.kernel.org, linux-s390@vger.kernel.org, kvm@vger.kernel.org, frankja@linux.ibm.com, akrowiak@linux.ibm.com, pasic@linux.ibm.com, david@redhat.com, schwidefsky@de.ibm.com, heiko.carstens@de.ibm.com, freude@linux.ibm.com, mimu@linux.ibm.com
	Subject: [PATCH v5 1/7] s390: ap: kvm: add PQAP interception for AQIC
	Date: Wed, 13 Mar 2019 17:04:58 +0100
	In-Reply-To: <1552493104-30510-1-git-send-email-pmorel@linux.ibm.com>
	References: <1552493104-30510-1-git-send-email-pmorel@linux.ibm.com>
	Message-Id: <1552493104-30510-2-git-send-email-pmorel@linux.ibm.com>
	Sender: kvm-owner@vger.kernel.org
	Precedence: bulk
Series	vfio: ap: AP Queue Interrupt Control
	[v5,0/7] vfio: ap: AP Queue Interrupt Control
	[v5,1/7] s390: ap: kvm: add PQAP interception for AQIC
	[v5,2/7] s390: ap: new vfio_ap_queue structure
	[v5,3/7] vfio: ap: register IOMMU VFIO notifier
	[v5,4/7] s390: ap: setup relation betwen KVM and mediated device
	[v5,5/7] s390: ap: implement PAPQ AQIC interception in kernel
	[v5,6/7] s390: ap: Cleanup on removing the AP device
	[v5,7/7] s390: ap: kvm: Enable PQAP/AQIC facility for the guest

Pierre Morel March 13, 2019, 4:04 p.m. UTC

We prepare the interception of the PQAP/AQIC instruction for
the case where the AQIC facility is enabled in the guest.

We add a callback inside the KVM arch structure for s390 so that
a VFIO driver can handle the PQAP instruction with the AQIC
command, and only this command.
The existing behavior for other commands should not change.

We inject the correct exceptions from inside KVM when the
callback is not initialized, which happens when the vfio_ap driver
is not loaded.

It is the duty of the VFIO driver to set up a PQAP callback inside
the crypto structure.
If the callback has been set up, we call it.
If not, we build an answer that tells the guest that no queue is
available.

We consider it the responsibility of the driver to always initialize
the PQAP callback if it defines queues by initializing the CRYCB for
a guest.

Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
---
 arch/s390/include/asm/kvm_host.h      |  8 +++++
 arch/s390/kvm/priv.c                  | 62 +++++++++++++++++++++++++++++++++++
 drivers/s390/crypto/vfio_ap_private.h |  2 ++
 3 files changed, 72 insertions(+)
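
The callback described in the commit message can be pictured with a small userspace model. This is only a sketch: the diff itself is not shown in this archive, so every type and function name below (`vcpu_model`, `pqap_hook_model`, `crypto_model`, `register_pqap_hook`, `unregister_pqap_hook`) is hypothetical. Only the `hook` and `owner` members mirror the accesses quoted later in the thread (`pqap_hook->hook(vcpu)` and `pqap_hook->owner`).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct kvm_vcpu; not the kernel definition. */
struct vcpu_model {
	unsigned long gprs[16];
};

/* Models the hook object: a callback plus the module that owns it. */
struct pqap_hook_model {
	int (*hook)(struct vcpu_model *vcpu);	/* driver callback */
	void *owner;				/* stands in for struct module * */
};

/* Models the crypto part of the KVM arch structure. */
struct crypto_model {
	struct pqap_hook_model *pqap_hook;	/* NULL until a driver registers */
};

/* Driver side: publish the callback. Returns -1 if a hook is already set. */
static int register_pqap_hook(struct crypto_model *c, struct pqap_hook_model *h)
{
	assert(c != NULL && h != NULL);
	if (c->pqap_hook != NULL)
		return -1;
	c->pqap_hook = h;
	return 0;
}

/* Driver side: withdraw the callback, e.g. when the driver goes away. */
static void unregister_pqap_hook(struct crypto_model *c)
{
	assert(c != NULL);
	c->pqap_hook = NULL;
}
```

The model ignores locking and module reference counting, which the real code has to deal with.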

Cornelia Huck March 15, 2019, 10:20 a.m. UTC | #1

On Wed, 13 Mar 2019 17:04:58 +0100
Pierre Morel <pmorel@linux.ibm.com> wrote:

> +/*
> + * handle_pqap: Handling pqap interception
> + * @vcpu: the vcpu having issue the pqap instruction
> + *
> + * We now support PQAP/AQIC instructions and we need to correctly
> + * answer the guest even if no dedicated driver's hook is available.
> + *
> + * The intercepting code calls a dedicated callback for this instruction
> + * if a driver did register one in the CRYPTO satellite of the
> + * SIE block.
> + *
> + * For PQAP/AQIC instructions only, verify privilege and specifications.
> + *
> + * If no callback available, the queues are not available, return this to
> + * the caller.
> + * Else return the value returned by the callback.
> + */
> +static int handle_pqap(struct kvm_vcpu *vcpu)
> +{
> +	uint8_t fc;
> +	struct ap_queue_status status = {};
> +	int ret;
> +	/* Verify that the AP instruction are available */
> +	if (!ap_instructions_available())
> +		return -EOPNOTSUPP;
> +	/* Verify that the guest is allowed to use AP instructions */
> +	if (!(vcpu->arch.sie_block->eca & ECA_APIE))
> +		return -EOPNOTSUPP;
> +	/* Verify that the function code is AQIC */
> +	fc = vcpu->run->s.regs.gprs[0] >> 24;
> +	/* We do not want to change the behavior we had before this patch*/
> +	if (fc != 0x03)
> +		return -EOPNOTSUPP;
> +
> +	/* PQAP instructions are allowed for guest kernel only */
> +	if (vcpu->arch.sie_block->gpsw.mask & PSW_MASK_PSTATE)
> +		return kvm_s390_inject_program_int(vcpu, PGM_PRIVILEGED_OP);
> +	/* AQIC instruction is allowed only if facility 65 is available */
> +	if (!test_kvm_facility(vcpu->kvm, 65))
> +		return kvm_s390_inject_program_int(vcpu, PGM_SPECIFICATION);
> +	/* Verify that the hook callback is registered and call it */
> +	if (vcpu->kvm->arch.crypto.pqap_hook) {
> +		if (!try_module_get(vcpu->kvm->arch.crypto.pqap_hook->owner))
> +			return -EOPNOTSUPP;
> +		ret = vcpu->kvm->arch.crypto.pqap_hook->hook(vcpu);
> +		module_put(vcpu->kvm->arch.crypto.pqap_hook->owner);
> +		return ret;
> +	}
> +	/*
> +	 * It is the duty of the vfio_driver to register a hook
> +	 * If it does not and we get an exception on AQIC we must
> +	 * guess that there is no vfio_ap_driver at all and no one
> +	 * to handle the guests's CRYCB and the CRYCB is empty.
> +	 */
> +	status.response_code = 0x01;

I'm still confused here, sorry. From previous discussions I recall that
this indicates "no crypto device" (please correct me if I'm wrong.)

Before this patch, we had:
- guest issues PQAP/AQIC -> drop to userspace

With a correct implementation, we get:
- guest issues PQAP/AQIC -> callback does what needs to be done

With an incorrect implementation (no callback), we get:
- guest issues PQAP/AQIC -> guest gets response code 0x01

Why not drop to userspace in that case?


> +	memcpy(&vcpu->run->s.regs.gprs[1], &status, sizeof(status));
> +	return 0;
> +}
> +
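
To make the order of checks in the quoted v5 `handle_pqap()` easier to follow, here is a userspace model of its decision flow. It is a sketch under assumptions: the `pqap_state` structure and the `pqap_outcome` values are invented for illustration, each boolean stands in for one kernel test from the patch, and a failing `try_module_get()` (which the patch maps to -EOPNOTSUPP) is not modelled.

```c
#include <stdbool.h>
#include <stdint.h>

/* Possible results of the modelled handler (names are hypothetical). */
enum pqap_outcome {
	PQAP_TO_USERSPACE,	/* models returning -EOPNOTSUPP */
	PQAP_PGM_PRIVILEGED,	/* privileged operation exception injected */
	PQAP_PGM_SPECIFICATION,	/* specification exception injected */
	PQAP_HOOK_CALLED,	/* the registered callback handles the instruction */
	PQAP_RC_01,		/* response code 0x01 stored for the guest */
};

/* One flag per test performed by the quoted patch. */
struct pqap_state {
	bool ap_available;	/* ap_instructions_available() */
	bool eca_apie;		/* ECA_APIE set in the SIE block */
	uint64_t gpr0;		/* guest general register 0 */
	bool problem_state;	/* PSW_MASK_PSTATE set in the guest PSW */
	bool facility_65;	/* test_kvm_facility(kvm, 65) */
	bool hook_registered;	/* crypto.pqap_hook != NULL */
};

static enum pqap_outcome v5_handle_pqap(const struct pqap_state *s)
{
	/* Same extraction as the patch: gprs[0] >> 24 truncated to 8 bits. */
	uint8_t fc = (uint8_t)(s->gpr0 >> 24);

	if (!s->ap_available || !s->eca_apie)
		return PQAP_TO_USERSPACE;
	if (fc != 0x03)			/* only AQIC is handled here */
		return PQAP_TO_USERSPACE;
	if (s->problem_state)
		return PQAP_PGM_PRIVILEGED;
	if (!s->facility_65)
		return PQAP_PGM_SPECIFICATION;
	if (s->hook_registered)
		return PQAP_HOOK_CALLED;
	return PQAP_RC_01;		/* the case questioned in the review */
}
```

In this model the last `return` is the branch the review asks about: with no hook registered, the guest gets response code 0x01 instead of the instruction being passed to userspace.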

Pierre Morel March 15, 2019, 1:26 p.m. UTC | #2

On 15/03/2019 11:20, Cornelia Huck wrote:
> On Wed, 13 Mar 2019 17:04:58 +0100
> Pierre Morel <pmorel@linux.ibm.com> wrote:
> 
>> +/*
>> + * handle_pqap: Handling pqap interception
>> + * @vcpu: the vcpu having issue the pqap instruction
>> + *
>> + * We now support PQAP/AQIC instructions and we need to correctly
>> + * answer the guest even if no dedicated driver's hook is available.
>> + *
>> + * The intercepting code calls a dedicated callback for this instruction
>> + * if a driver did register one in the CRYPTO satellite of the
>> + * SIE block.
>> + *
>> + * For PQAP/AQIC instructions only, verify privilege and specifications.
>> + *
>> + * If no callback available, the queues are not available, return this to
>> + * the caller.
>> + * Else return the value returned by the callback.
>> + */
>> +static int handle_pqap(struct kvm_vcpu *vcpu)
>> +{
>> +	uint8_t fc;
>> +	struct ap_queue_status status = {};
>> +	int ret;
>> +	/* Verify that the AP instruction are available */
>> +	if (!ap_instructions_available())
>> +		return -EOPNOTSUPP;
>> +	/* Verify that the guest is allowed to use AP instructions */
>> +	if (!(vcpu->arch.sie_block->eca & ECA_APIE))
>> +		return -EOPNOTSUPP;
>> +	/* Verify that the function code is AQIC */
>> +	fc = vcpu->run->s.regs.gprs[0] >> 24;
>> +	/* We do not want to change the behavior we had before this patch*/
>> +	if (fc != 0x03)
>> +		return -EOPNOTSUPP;
>> +
>> +	/* PQAP instructions are allowed for guest kernel only */
>> +	if (vcpu->arch.sie_block->gpsw.mask & PSW_MASK_PSTATE)
>> +		return kvm_s390_inject_program_int(vcpu, PGM_PRIVILEGED_OP);
>> +	/* AQIC instruction is allowed only if facility 65 is available */
>> +	if (!test_kvm_facility(vcpu->kvm, 65))
>> +		return kvm_s390_inject_program_int(vcpu, PGM_SPECIFICATION);
>> +	/* Verify that the hook callback is registered and call it */
>> +	if (vcpu->kvm->arch.crypto.pqap_hook) {
>> +		if (!try_module_get(vcpu->kvm->arch.crypto.pqap_hook->owner))
>> +			return -EOPNOTSUPP;
>> +		ret = vcpu->kvm->arch.crypto.pqap_hook->hook(vcpu);
>> +		module_put(vcpu->kvm->arch.crypto.pqap_hook->owner);
>> +		return ret;
>> +	}
>> +	/*
>> +	 * It is the duty of the vfio_driver to register a hook
>> +	 * If it does not and we get an exception on AQIC we must
>> +	 * guess that there is no vfio_ap_driver at all and no one
>> +	 * to handle the guests's CRYCB and the CRYCB is empty.
>> +	 */
>> +	status.response_code = 0x01;
> 
> I'm still confused here, sorry. From previous discussions I recall that
> this indicates "no crypto device" (please correct me if I'm wrong.)
> 
> Before this patch, we had:
> - guest issues PQAP/AQIC -> drop to userspace
> 
> With a correct implementation, we get:
> - guest issues PQAP/AQIC -> callback does what needs to be done
> 
> With an incorrect implementation (no callback), we get:
> - guest issues PQAP/AQIC -> guest gets response code 0x01
> 
> Why not drop to userspace in that case?

This is what I had in the previous patches.
Hmm, I do not remember which discussion led me to modify this.

Anyway, now that you have put your finger on this problem, I think the 
problem is worse.

The behavior with old / new Linux, vfio driver and qemu is:

LINUX  VFIO_AP  QEMU          PGM
OLD    x        x             OPERATION
NEW    -        OLD           SPECIFICATION
NEW    -        NEW/aqic=off  SPECIFICATION
NEW    x        NEW/aqic=on   -

x = whatever
- = absent/none
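
The table can also be read as a first-match lookup with wildcards. The sketch below encodes exactly the four rows given and nothing more. All identifiers are hypothetical, and combinations the table does not list (for example a NEW kernel with vfio_ap present and an OLD QEMU) deliberately return `RESULT_UNLISTED` rather than a guessed answer.

```c
#include <stddef.h>

enum kernel_ver { KERNEL_OLD, KERNEL_NEW };
enum vfio_state { VFIO_ABSENT, VFIO_PRESENT, VFIO_ANY };	/* ANY models "x" */
enum qemu_ver { QEMU_OLD, QEMU_NEW_AQIC_OFF, QEMU_NEW_AQIC_ON, QEMU_ANY };
enum pgm_result {
	RESULT_NONE,		/* "-" in the PGM column */
	RESULT_OPERATION,
	RESULT_SPECIFICATION,
	RESULT_UNLISTED,	/* combination not covered by the table */
};

struct behavior_row {
	enum kernel_ver kernel;
	enum vfio_state vfio;
	enum qemu_ver qemu;
	enum pgm_result pgm;
};

/* The four rows of the table, in the order given in the mail. */
static const struct behavior_row behavior_table[] = {
	{ KERNEL_OLD, VFIO_ANY,    QEMU_ANY,          RESULT_OPERATION },
	{ KERNEL_NEW, VFIO_ABSENT, QEMU_OLD,          RESULT_SPECIFICATION },
	{ KERNEL_NEW, VFIO_ABSENT, QEMU_NEW_AQIC_OFF, RESULT_SPECIFICATION },
	{ KERNEL_NEW, VFIO_ANY,    QEMU_NEW_AQIC_ON,  RESULT_NONE },
};

/* Return the PGM column of the first row matching the given setup. */
static enum pgm_result lookup_behavior(enum kernel_ver kernel,
				       enum vfio_state vfio,
				       enum qemu_ver qemu)
{
	size_t i;

	for (i = 0; i < sizeof(behavior_table) / sizeof(behavior_table[0]); i++) {
		const struct behavior_row *r = &behavior_table[i];

		if (r->kernel != kernel)
			continue;
		if (r->vfio != VFIO_ANY && r->vfio != vfio)
			continue;
		if (r->qemu != QEMU_ANY && r->qemu != qemu)
			continue;
		return r->pgm;
	}
	return RESULT_UNLISTED;
}
```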

So yes, there is a change in behavior for userland in the case where QEMU 
does not set the AQIC facility 65: an OLD QEMU, or a NEW QEMU that wants 
to behave like an older one.

I fear we have the same problem with the privileged operation...

For the last case, when kvm_facility(65) is set, the explanation is 
the following:

This is related to the handling of PQAP AQIC which is now authorized by 
this patch series.
If we authorize PQAP AQIC, by setting the bit for facility 65, the guest 
can use this instruction.
If the instruction follows the specifications we must answer something 
realistic and since there is nothing in the CRYCB (no driver) we answer 
that there is no queue.

Conclusion: we must handle this in userland, which has the benefit 
of keeping the old behavior when there is no callback.
OLD QEMU will not see any change, as it will not set the AQIC facility.
NEW QEMU will handle this correctly.

In this case we also do not need to handle all the other tests here but 
can move them to the callback, as Tony wanted.

Would you agree with something simple like:

static int handle_pqap(struct kvm_vcpu *vcpu)
{
         int ret = -EOPNOTSUPP;

         /* Verify that the hook callback is registered and call it */
         if (pqap_hook)
                 if (try_module_get(pqap_hook->owner)) {
                         ret = pqap_hook->hook(vcpu);
                         module_put(pqap_hook->owner);
                 }
         return ret;
}

All other tests in QEMU and in the callback.
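
The proposal above leaves `pqap_hook` undeclared, so here is a self-contained userspace sketch of the same control flow. It rests on assumptions: `module_model`, `model_try_module_get()` and `model_module_put()` are simplified stand-ins for the kernel's module reference counting, the hook is passed as a parameter instead of being read from the KVM crypto structure, and `example_hook()` is a placeholder callback.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct module and struct kvm_vcpu. */
struct module_model {
	bool live;	/* false once the module is being unloaded */
	int refcount;
};

struct vcpu_model {
	int id;
};

struct pqap_hook_model {
	int (*hook)(struct vcpu_model *vcpu);
	struct module_model *owner;
};

/* Simplified model of try_module_get(): fails if the module is going away. */
static bool model_try_module_get(struct module_model *m)
{
	if (!m->live)
		return false;
	m->refcount++;
	return true;
}

/* Simplified model of module_put(). */
static void model_module_put(struct module_model *m)
{
	m->refcount--;
}

/* Same shape as the proposal: default to -EOPNOTSUPP, else defer to the hook. */
static int handle_pqap_model(struct vcpu_model *vcpu,
			     struct pqap_hook_model *pqap_hook)
{
	int ret = -EOPNOTSUPP;

	if (pqap_hook && model_try_module_get(pqap_hook->owner)) {
		ret = pqap_hook->hook(vcpu);
		model_module_put(pqap_hook->owner);
	}
	return ret;
}

/* Placeholder callback: pretends the instruction was handled. */
static int example_hook(struct vcpu_model *vcpu)
{
	(void)vcpu;
	return 0;
}
```

With no hook, or with a hook whose owner can no longer be pinned, the model returns -EOPNOTSUPP, which corresponds to the "drop to userspace" path discussed in the review.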

Thanks for the comments.

Regards,
Pierre

Cornelia Huck March 15, 2019, 1:41 p.m. UTC | #3

On Fri, 15 Mar 2019 14:26:34 +0100
Pierre Morel <pmorel@linux.ibm.com> wrote:

> Conclusion:  we must handle this in userland, it will have the benefit 
> to keep old behavior when there is no callback.
> OLD QEMU will not see change as they will not set aqic facility
> NEW QEMU will handle this correctly.
> 
> In this case we also do not need to handle all other tests here but can 
> move it to the callback as Tony wanted.
> 
> Would you agree with something simple like:
> 
> static int handle_pqap(struct kvm_vcpu *vcpu)
> {
>          int ret = -EOPNOTSUPP;
> 
>          /* Verify that the hook callback is registered and call it */
>          if (pqap_hook)
>                  if (try_module_get(pqap_hook->owner)) {
>                          ret = pqap_hook->hook(vcpu);
>                          module_put(pqap_hook->owner);
>                  }
>          return ret;
> }
> 
> All other tests in QEMU and in the callback.

With the hook checking for priv, fc, etc.? Yeah, might work.

But don't count on my feedback too much right now, better wait for
others' comments :) I'll resume in April, if needed.

Pierre Morel March 15, 2019, 1:44 p.m. UTC | #4

On 15/03/2019 14:41, Cornelia Huck wrote:
> On Fri, 15 Mar 2019 14:26:34 +0100
> Pierre Morel <pmorel@linux.ibm.com> wrote:
> 
>> Conclusion:  we must handle this in userland, it will have the benefit
>> to keep old behavior when there is no callback.
>> OLD QEMU will not see change as they will not set aqic facility
>> NEW QEMU will handle this correctly.
>>
>> In this case we also do not need to handle all other tests here but can
>> move it to the callback as Tony wanted.
>>
>> Would you agree with something simple like:
>>
>> static int handle_pqap(struct kvm_vcpu *vcpu)
>> {
>>           int ret = -EOPNOTSUPP;
>>
>>           /* Verify that the hook callback is registered and call it */
>>           if (pqap_hook)
>>                   if (try_module_get(pqap_hook->owner)) {
>>                           ret = pqap_hook->hook(vcpu);
>>                           module_put(pqap_hook->owner);
>>                   }
>>           return ret;
>> }
>>
>> All other tests in QEMU and in the callback.
> 
> With the hook checking for priv, fc, etc.? Yeah, might work.
> 
> But don't count on my feedback too much right now, better wait for
> others' comments :) I'll resume in April, if needed.
> 

OK, thanks

Pierre

Pierre Morel March 15, 2019, 2:10 p.m. UTC | #5

On 15/03/2019 14:26, Pierre Morel wrote:
> On 15/03/2019 11:20, Cornelia Huck wrote:
>> On Wed, 13 Mar 2019 17:04:58 +0100
>> Pierre Morel <pmorel@linux.ibm.com> wrote:
>>
>>> +/*
>>> + * handle_pqap: Handling pqap interception
>>> + * @vcpu: the vcpu having issue the pqap instruction
>>> + *
>>> + * We now support PQAP/AQIC instructions and we need to correctly
>>> + * answer the guest even if no dedicated driver's hook is available.
>>> + *
>>> + * The intercepting code calls a dedicated callback for this 
>>> instruction
>>> + * if a driver did register one in the CRYPTO satellite of the
>>> + * SIE block.
>>> + *
>>> + * For PQAP/AQIC instructions only, verify privilege and 
>>> specifications.
>>> + *
>>> + * If no callback available, the queues are not available, return 
>>> this to
>>> + * the caller.
>>> + * Else return the value returned by the callback.
>>> + */
>>> +static int handle_pqap(struct kvm_vcpu *vcpu)
>>> +{
>>> +    uint8_t fc;
>>> +    struct ap_queue_status status = {};
>>> +    int ret;
>>> +    /* Verify that the AP instruction are available */
>>> +    if (!ap_instructions_available())
>>> +        return -EOPNOTSUPP;
>>> +    /* Verify that the guest is allowed to use AP instructions */
>>> +    if (!(vcpu->arch.sie_block->eca & ECA_APIE))
>>> +        return -EOPNOTSUPP;
>>> +    /* Verify that the function code is AQIC */
>>> +    fc = vcpu->run->s.regs.gprs[0] >> 24;
>>> +    /* We do not want to change the behavior we had before this patch*/
>>> +    if (fc != 0x03)
>>> +        return -EOPNOTSUPP;
>>> +
>>> +    /* PQAP instructions are allowed for guest kernel only */
>>> +    if (vcpu->arch.sie_block->gpsw.mask & PSW_MASK_PSTATE)
>>> +        return kvm_s390_inject_program_int(vcpu, PGM_PRIVILEGED_OP);
>>> +    /* AQIC instruction is allowed only if facility 65 is available */
>>> +    if (!test_kvm_facility(vcpu->kvm, 65))
>>> +        return kvm_s390_inject_program_int(vcpu, PGM_SPECIFICATION);
>>> +    /* Verify that the hook callback is registered and call it */
>>> +    if (vcpu->kvm->arch.crypto.pqap_hook) {
>>> +        if (!try_module_get(vcpu->kvm->arch.crypto.pqap_hook->owner))
>>> +            return -EOPNOTSUPP;
>>> +        ret = vcpu->kvm->arch.crypto.pqap_hook->hook(vcpu);
>>> +        module_put(vcpu->kvm->arch.crypto.pqap_hook->owner);
>>> +        return ret;
>>> +    }
>>> +    /*
>>> +     * It is the duty of the vfio_driver to register a hook
>>> +     * If it does not and we get an exception on AQIC we must
>>> +     * guess that there is no vfio_ap_driver at all and no one
>>> +     * to handle the guests's CRYCB and the CRYCB is empty.
>>> +     */
>>> +    status.response_code = 0x01;
>>
>> I'm still confused here, sorry. From previous discussions I recall that
>> this indicates "no crypto device" (please correct me if I'm wrong.)
>>
>> Before this patch, we had:
>> - guest issues PQAP/AQIC -> drop to userspace
>>
>> With a correct implementation, we get:
>> - guest issues PQAP/AQIC -> callback does what needs to be done
>>
>> With an incorrect implementation (no callback), we get:
>> - guest issues PQAP/AQIC -> guest gets response code 0x01
>>
>> Why not drop to userspace in that case?
> 
> This is what I had in the previous patches.
> Hum, I do not remember which discussion lead me to modify this.
> 
> Anyway, now that you put the finger on this problem, I think the problem 
> is worse.
> 
> The behavior with old / new Linux, vfio driver and qemu is:
> 
> LINUX    VFIO_AP    QEMU    PGM
> OLD    x    x    OPERATION
> NEW    -    OLD    SPECIFICATION
> NEW    -    NEW/aqic=off    SPECIFICATION
> NEW    x    NEW/aqic=on    -
> 
> x = whatever
> - = absent/none
> 
> So yes there is a change in behavior for the userland for the case QEMU 
> do not set the AQIC facility 65, OLD QEMU or NEW QEMU wanting to behave 
> like an older one.
> 
> I fear we have the same problem with the privileged operation...
> 
> For the last case, when the kvm_facility(65) is set, the explication is 
> the following:
> 
> This is related to the handling of PQAP AQIC which is now authorized by 
> this patch series.
> If we authorize PQAP AQIC, by setting the bit for facility 65, the guest 
> can use this instruction.
> If the instruction follows the specifications we must answer something 
> realistic and since there is nothing in the CRYCB (no driver) we answer 
> that there is no queue.
> 
> Conclusion:  we must handle this in userland, it will have the benefit 
> to keep old behavior when there is no callback.
> OLD QEMU will not see change as they will not set aqic facility
> NEW QEMU will handle this correctly.
> 

Sorry, wrong conclusion: handling this in userland would take us much 
too far if we want to answer correctly in the case where the hook is not 
there but QEMU accepted the facility for AQIC.

The alternative is easier, we just continue to respond with the 
OPERATION exception here and only handle the specification and 
privileged exception cases in QEMU and in the hook.

So, I think the discussion will go on until you come back :)

Regards,
Pierre

Halil Pasic March 15, 2019, 5:28 p.m. UTC | #6

On Fri, 15 Mar 2019 14:26:34 +0100
Pierre Morel <pmorel@linux.ibm.com> wrote:

> On 15/03/2019 11:20, Cornelia Huck wrote:
> > On Wed, 13 Mar 2019 17:04:58 +0100
> > Pierre Morel <pmorel@linux.ibm.com> wrote:
> > 
> >> +/*
> >> + * handle_pqap: Handling pqap interception
> >> + * @vcpu: the vcpu having issue the pqap instruction
> >> + *
> >> + * We now support PQAP/AQIC instructions and we need to correctly
> >> + * answer the guest even if no dedicated driver's hook is available.
> >> + *
> >> + * The intercepting code calls a dedicated callback for this instruction
> >> + * if a driver did register one in the CRYPTO satellite of the
> >> + * SIE block.
> >> + *
> >> + * For PQAP/AQIC instructions only, verify privilege and specifications.
> >> + *
> >> + * If no callback available, the queues are not available, return this to
> >> + * the caller.
> >> + * Else return the value returned by the callback.
> >> + */
> >> +static int handle_pqap(struct kvm_vcpu *vcpu)
> >> +{
> >> +	uint8_t fc;
> >> +	struct ap_queue_status status = {};
> >> +	int ret;
> >> +	/* Verify that the AP instruction are available */
> >> +	if (!ap_instructions_available())
> >> +		return -EOPNOTSUPP;
> >> +	/* Verify that the guest is allowed to use AP instructions */
> >> +	if (!(vcpu->arch.sie_block->eca & ECA_APIE))
> >> +		return -EOPNOTSUPP;
> >> +	/* Verify that the function code is AQIC */
> >> +	fc = vcpu->run->s.regs.gprs[0] >> 24;
> >> +	/* We do not want to change the behavior we had before this patch*/
> >> +	if (fc != 0x03)
> >> +		return -EOPNOTSUPP;
> >> +
> >> +	/* PQAP instructions are allowed for guest kernel only */
> >> +	if (vcpu->arch.sie_block->gpsw.mask & PSW_MASK_PSTATE)
> >> +		return kvm_s390_inject_program_int(vcpu, PGM_PRIVILEGED_OP);
> >> +	/* AQIC instruction is allowed only if facility 65 is available */
> >> +	if (!test_kvm_facility(vcpu->kvm, 65))
> >> +		return kvm_s390_inject_program_int(vcpu, PGM_SPECIFICATION);
> >> +	/* Verify that the hook callback is registered and call it */
> >> +	if (vcpu->kvm->arch.crypto.pqap_hook) {
> >> +		if (!try_module_get(vcpu->kvm->arch.crypto.pqap_hook->owner))
> >> +			return -EOPNOTSUPP;
> >> +		ret = vcpu->kvm->arch.crypto.pqap_hook->hook(vcpu);
> >> +		module_put(vcpu->kvm->arch.crypto.pqap_hook->owner);
> >> +		return ret;
> >> +	}
> >> +	/*
> >> +	 * It is the duty of the vfio_driver to register a hook
> >> +	 * If it does not and we get an exception on AQIC we must
> >> +	 * guess that there is no vfio_ap_driver at all and no one
> >> +	 * to handle the guests's CRYCB and the CRYCB is empty.
> >> +	 */
> >> +	status.response_code = 0x01;
> > 
> > I'm still confused here, sorry. From previous discussions I recall that
> > this indicates "no crypto device" (please correct me if I'm wrong.)
> > 
> > Before this patch, we had:
> > - guest issues PQAP/AQIC -> drop to userspace
> > 
> > With a correct implementation, we get:
> > - guest issues PQAP/AQIC -> callback does what needs to be done
> > 
> > With an incorrect implementation (no callback), we get:
> > - guest issues PQAP/AQIC -> guest gets response code 0x01
> > 
> > Why not drop to userspace in that case?
> 
> This is what I had in the previous patches.
> Hum, I do not remember which discussion lead me to modify this.
> 
> Anyway, now that you put the finger on this problem, I think the problem 
> is worse.
> 
> The behavior with old / new Linux, vfio driver and qemu is:
> 
> LINUX	VFIO_AP	QEMU	PGM
> OLD	x	x	OPERATION

Isn't OPERATION a bad answer if ap=on? It should not happen
with a well behaved guest because facility 65 is not indicated,
but if it does, I guess we give the wrong answer.

> NEW	-	OLD	SPECIFICATION
> NEW	-	NEW/aqic=off	SPECIFICATION
> NEW	x	NEW/aqic=on	-
> 

AFAICT with LINUX == NEW we get the correct answer. OPERATION exception
is only good if ap=off.

> x = whatever
> - = absent/none
> 
> So yes there is a change in behavior for the userland for the case QEMU 
> do not set the AQIC facility 65, OLD QEMU or NEW QEMU wanting to behave 
> like an older one.
> 
> I fear we have the same problem with the privileged operation...
> 

IMHO this boils down to:
* either OLD QEMU or 
* OLD LINUX
should have taken care of handling the mandatory intercept for PQAP/AQIC
when ap=on (i.e. the guest has AP instructions) and the guest does not
have facility 65, which was the case for OLD.

Things get complicated when one considers that ECA.28 is an effective
control.

> For the last case, when the kvm_facility(65) is set, the explication is 
> the following:
> 
> This is related to the handling of PQAP AQIC which is now authorized by 
> this patch series.
> If we authorize PQAP AQIC, by setting the bit for facility 65, the guest 
> can use this instruction.
> If the instruction follows the specifications we must answer something 
> realistic and since there is nothing in the CRYCB (no driver) we answer 
> that there is no queue.
> 
> Conclusion:  we must handle this in userland, it will have the benefit 
> to keep old behavior when there is no callback.
> OLD QEMU will not see change as they will not set aqic facility

That would mean we remain quirky.

> NEW QEMU will handle this correctly.
> 
> In this case we also do not need to handle all other tests here but can 
> move it to the callback as Tony wanted.
> 
> Would you agree with something simple like:
> 
> static int handle_pqap(struct kvm_vcpu *vcpu)
> {
>          int ret = -EOPNOTSUPP;
> 
>          /* Verify that the hook callback is registered and call it */
>          if (pqap_hook)
>                  if (try_module_get(pqap_hook->owner)) {
>                          ret = pqap_hook->hook(vcpu);
>                          module_put(pqap_hook->owner);
>                  }
>          return ret;
> }
> 
> All other tests in QEMU and in the callback.
> 

You stated in another email that the conclusion is wrong. I'm not sure
what the cleanest solution is here. This effective control thing does
make my head spin.

Regards,
Halil

Halil Pasic March 15, 2019, 5:43 p.m. UTC | #7

On Fri, 15 Mar 2019 15:10:25 +0100
Pierre Morel <pmorel@linux.ibm.com> wrote:

> Sorry, wrong conclusion, handling this in userland will bring us much 
> too far if we want to answer correctly for the case the hook is not 
> there but QEMU accepted the facility for AQIC.
> 
> The alternative is easier, we just continue to respond with the 
> OPERATION exception here and only handle the specification and 
> privileged exception cases in QEMU and in the hook.

I don't quite understand what you mean by this paragraph, especially
what you mean by 'just continue to respond with the OPERATION
exception here'.

In any case, if the guest is supposed to have AP instructions and does
not have facility 65, the right answer is a specification exception, not
an operation exception. And this has to work regardless of whether the
vfio-ap module is loaded.
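
The rule stated in this mail and in the previous one (OPERATION is only right with ap=off; with AP instructions but without facility 65 the answer is SPECIFICATION) can be written as a tiny decision function. This is a sketch with hypothetical names; it covers only these two conditions and leaves out other checks such as the privileged-operation case.

```c
#include <stdbool.h>

enum pgm_kind {
	PGM_KIND_NONE,		/* no program exception from these two checks */
	PGM_KIND_OPERATION,
	PGM_KIND_SPECIFICATION,
};

/*
 * guest_has_ap: the guest is configured with AP instructions (ap=on).
 * facility_65:  the AQIC facility is indicated to the guest.
 */
static enum pgm_kind expected_aqic_pgm(bool guest_has_ap, bool facility_65)
{
	if (!guest_has_ap)
		return PGM_KIND_OPERATION;	/* only correct answer with ap=off */
	if (!facility_65)
		return PGM_KIND_SPECIFICATION;
	return PGM_KIND_NONE;
}
```

Note that neither input depends on whether the vfio-ap module is loaded, which is the point made above.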

Regards,
Halil

> 
> So, I think the discussion will go on until you come back :)
> 
> Regards,
> Pierre

Pierre Morel March 19, 2019, 9:55 a.m. UTC | #8

On 15/03/2019 15:10, Pierre Morel wrote:
> On 15/03/2019 14:26, Pierre Morel wrote:
>> On 15/03/2019 11:20, Cornelia Huck wrote:
>>> On Wed, 13 Mar 2019 17:04:58 +0100
>>> Pierre Morel <pmorel@linux.ibm.com> wrote:
>>>
>>>> +/*
>>>> + * handle_pqap: Handling pqap interception
>>>> + * @vcpu: the vcpu having issue the pqap instruction
>>>> + *
>>>> + * We now support PQAP/AQIC instructions and we need to correctly
>>>> + * answer the guest even if no dedicated driver's hook is available.
>>>> + *
>>>> + * The intercepting code calls a dedicated callback for this 
>>>> instruction
>>>> + * if a driver did register one in the CRYPTO satellite of the
>>>> + * SIE block.
>>>> + *
>>>> + * For PQAP/AQIC instructions only, verify privilege and 
>>>> specifications.
>>>> + *
>>>> + * If no callback available, the queues are not available, return 
>>>> this to
>>>> + * the caller.
>>>> + * Else return the value returned by the callback.
>>>> + */
>>>> +static int handle_pqap(struct kvm_vcpu *vcpu)
>>>> +{
>>>> +    uint8_t fc;
>>>> +    struct ap_queue_status status = {};
>>>> +    int ret;
>>>> +    /* Verify that the AP instruction are available */
>>>> +    if (!ap_instructions_available())
>>>> +        return -EOPNOTSUPP;
>>>> +    /* Verify that the guest is allowed to use AP instructions */
>>>> +    if (!(vcpu->arch.sie_block->eca & ECA_APIE))
>>>> +        return -EOPNOTSUPP;
>>>> +    /* Verify that the function code is AQIC */
>>>> +    fc = vcpu->run->s.regs.gprs[0] >> 24;
>>>> +    /* We do not want to change the behavior we had before this 
>>>> patch*/
>>>> +    if (fc != 0x03)
>>>> +        return -EOPNOTSUPP;
>>>> +
>>>> +    /* PQAP instructions are allowed for guest kernel only */
>>>> +    if (vcpu->arch.sie_block->gpsw.mask & PSW_MASK_PSTATE)
>>>> +        return kvm_s390_inject_program_int(vcpu, PGM_PRIVILEGED_OP);
>>>> +    /* AQIC instruction is allowed only if facility 65 is available */
>>>> +    if (!test_kvm_facility(vcpu->kvm, 65))
>>>> +        return kvm_s390_inject_program_int(vcpu, PGM_SPECIFICATION);
>>>> +    /* Verify that the hook callback is registered and call it */
>>>> +    if (vcpu->kvm->arch.crypto.pqap_hook) {
>>>> +        if (!try_module_get(vcpu->kvm->arch.crypto.pqap_hook->owner))
>>>> +            return -EOPNOTSUPP;
>>>> +        ret = vcpu->kvm->arch.crypto.pqap_hook->hook(vcpu);
>>>> +        module_put(vcpu->kvm->arch.crypto.pqap_hook->owner);
>>>> +        return ret;
>>>> +    }
>>>> +    /*
>>>> +     * It is the duty of the vfio_driver to register a hook
>>>> +     * If it does not and we get an exception on AQIC we must
>>>> +     * guess that there is no vfio_ap_driver at all and no one
>>>> +     * to handle the guests's CRYCB and the CRYCB is empty.
>>>> +     */
>>>> +    status.response_code = 0x01;
>>>
>>> I'm still confused here, sorry. From previous discussions I recall that
>>> this indicates "no crypto device" (please correct me if I'm wrong.)
>>>
>>> Before this patch, we had:
>>> - guest issues PQAP/AQIC -> drop to userspace
>>>
>>> With a correct implementation, we get:
>>> - guest issues PQAP/AQIC -> callback does what needs to be done
>>>
>>> With an incorrect implementation (no callback), we get:
>>> - guest issues PQAP/AQIC -> guest gets response code 0x01
>>>
>>> Why not drop to userspace in that case?
>>
>> This is what I had in the previous patches.
>> Hum, I do not remember which discussion lead me to modify this.
>>
>> Anyway, now that you put the finger on this problem, I think the 
>> problem is worse.
>>
>> The behavior with old / new Linux, vfio driver and qemu is:
>>
>> LINUX    VFIO_AP    QEMU    PGM
>> OLD    x    x    OPERATION
>> NEW    -    OLD    SPECIFICATION
>> NEW    -    NEW/aqic=off    SPECIFICATION
>> NEW    x    NEW/aqic=on    -
>>
>> x = whatever
>> - = absent/none
>>
>> So yes there is a change in behavior for the userland for the case 
>> QEMU do not set the AQIC facility 65, OLD QEMU or NEW QEMU wanting to 
>> behave like an older one.
>>
>> I fear we have the same problem with the privileged operation...
>>
>> For the last case, when the kvm_facility(65) is set, the explication 
>> is the following:
>>
>> This is related to the handling of PQAP AQIC which is now authorized 
>> by this patch series.
>> If we authorize PQAP AQIC, by setting the bit for facility 65, the 
>> guest can use this instruction.
>> If the instruction follows the specifications we must answer something 
>> realistic and since there is nothing in the CRYCB (no driver) we 
>> answer that there is no queue.
>>
>> Conclusion:  we must handle this in userland, it will have the benefit 
>> to keep old behavior when there is no callback.
>> OLD QEMU will not see change as they will not set aqic facility
>> NEW QEMU will handle this correctly.
>>
> 
> Sorry, wrong conclusion, handling this in userland will bring us much 
> too far if we want to answer correctly for the case the hook is not 
> there but QEMU accepted the facility for AQIC.

Sorry, forget it, I was tired.

Pierre

> 
> The alternative is easier, we just continue to respond with the 
> OPERATION exception here and only handle the specification and 
> privileged exception cases in QEMU and in the hook.
> 
> So, I think the discussion will go on until you come back :)
> 
> Regards,
> Pierre
>

Pierre Morel March 19, 2019, 10:01 a.m. UTC | #9

On 15/03/2019 18:28, Halil Pasic wrote:
> On Fri, 15 Mar 2019 14:26:34 +0100
> Pierre Morel <pmorel@linux.ibm.com> wrote:
> 
>> On 15/03/2019 11:20, Cornelia Huck wrote:
>>> On Wed, 13 Mar 2019 17:04:58 +0100
>>> Pierre Morel <pmorel@linux.ibm.com> wrote:
>>>
>>>> +/*
>>>> + * handle_pqap: Handling pqap interception
>>>> + * @vcpu: the vcpu having issue the pqap instruction
>>>> + *
>>>> + * We now support PQAP/AQIC instructions and we need to correctly
>>>> + * answer the guest even if no dedicated driver's hook is available.
>>>> + *
>>>> + * The intercepting code calls a dedicated callback for this instruction
>>>> + * if a driver did register one in the CRYPTO satellite of the
>>>> + * SIE block.
>>>> + *
>>>> + * For PQAP/AQIC instructions only, verify privilege and specifications.
>>>> + *
>>>> + * If no callback available, the queues are not available, return this to
>>>> + * the caller.
>>>> + * Else return the value returned by the callback.
>>>> + */
>>>> +static int handle_pqap(struct kvm_vcpu *vcpu)
>>>> +{
>>>> +	uint8_t fc;
>>>> +	struct ap_queue_status status = {};
>>>> +	int ret;
>>>> +	/* Verify that the AP instruction are available */
>>>> +	if (!ap_instructions_available())
>>>> +		return -EOPNOTSUPP;
>>>> +	/* Verify that the guest is allowed to use AP instructions */
>>>> +	if (!(vcpu->arch.sie_block->eca & ECA_APIE))
>>>> +		return -EOPNOTSUPP;
>>>> +	/* Verify that the function code is AQIC */
>>>> +	fc = vcpu->run->s.regs.gprs[0] >> 24;
>>>> +	/* We do not want to change the behavior we had before this patch*/
>>>> +	if (fc != 0x03)
>>>> +		return -EOPNOTSUPP;
>>>> +
>>>> +	/* PQAP instructions are allowed for guest kernel only */
>>>> +	if (vcpu->arch.sie_block->gpsw.mask & PSW_MASK_PSTATE)
>>>> +		return kvm_s390_inject_program_int(vcpu, PGM_PRIVILEGED_OP);
>>>> +	/* AQIC instruction is allowed only if facility 65 is available */
>>>> +	if (!test_kvm_facility(vcpu->kvm, 65))
>>>> +		return kvm_s390_inject_program_int(vcpu, PGM_SPECIFICATION);
>>>> +	/* Verify that the hook callback is registered and call it */
>>>> +	if (vcpu->kvm->arch.crypto.pqap_hook) {
>>>> +		if (!try_module_get(vcpu->kvm->arch.crypto.pqap_hook->owner))
>>>> +			return -EOPNOTSUPP;
>>>> +		ret = vcpu->kvm->arch.crypto.pqap_hook->hook(vcpu);
>>>> +		module_put(vcpu->kvm->arch.crypto.pqap_hook->owner);
>>>> +		return ret;
>>>> +	}
>>>> +	/*
>>>> +	 * It is the duty of the vfio_driver to register a hook
>>>> +	 * If it does not and we get an exception on AQIC we must
>>>> +	 * guess that there is no vfio_ap_driver at all and no one
>>>> +	 * to handle the guests's CRYCB and the CRYCB is empty.
>>>> +	 */
>>>> +	status.response_code = 0x01;
>>>
>>> I'm still confused here, sorry. From previous discussions I recall that
>>> this indicates "no crypto device" (please correct me if I'm wrong.)
>>>
>>> Before this patch, we had:
>>> - guest issues PQAP/AQIC -> drop to userspace
>>>
>>> With a correct implementation, we get:
>>> - guest issues PQAP/AQIC -> callback does what needs to be done
>>>
>>> With an incorrect implementation (no callback), we get:
>>> - guest issues PQAP/AQIC -> guest gets response code 0x01
>>>
>>> Why not drop to userspace in that case?
>>
>> This is what I had in the previous patches.
>> Hum, I do not remember which discussion lead me to modify this.
>>
>> Anyway, now that you put the finger on this problem, I think the problem
>> is worse.
>>
>> The behavior with old / new Linux, vfio driver and qemu is:
>>
>> LINUX	VFIO_AP	QEMU	PGM
>> OLD	x	x	OPERATION
> 
> Isn't OPERATION a bad answer if ap=on? It should not happen
> with a well behaved guest because facility 65 is not indicated,
> but if it does, I guess we give the wrong answer.

It is clearly wrong but we can not change the past :)

> 
>> NEW	-	OLD	SPECIFICATION
>> NEW	-	NEW/aqic=off	SPECIFICATION
>> NEW	x	NEW/aqic=on	-
>>
> 
> AFAICT with LINUX == NEW we get the correct answer. OPERATION exception
> is only good if ap=off.

Exactly.

> 
>> x = whatever
>> - = absent/none
>>
>> So yes there is a change in behavior for the userland for the case QEMU
>> do not set the AQIC facility 65, OLD QEMU or NEW QEMU wanting to behave
>> like an older one.
>>
>> I fear we have the same problem with the privileged operation...
>>
> 
> IMHO this boils down to:
> * either OLD QEMU or
> * OLD LINUX
> should have taken care of handling the mandatory intercept for PQAP/AQIC
> if ap=on (i.e. guest has AP instructions), and does not have facility 65
> which was the case for OLD.

yes

> 
> Things get complicated when one considers that ECA.28 is an effective
> control.

I don't think so, ECA_28 is not really a problem.
We do not propagate ECA_AIV in VSIE and ECA_AIV is tested in the vfio 
driver to support GISA.
So guest 3 will not support interrupts.

> 
>> For the last case, when the kvm_facility(65) is set, the explication is
>> the following:
>>
>> This is related to the handling of PQAP AQIC which is now authorized by
>> this patch series.
>> If we authorize PQAP AQIC, by setting the bit for facility 65, the guest
>> can use this instruction.
>> If the instruction follows the specifications we must answer something
>> realistic and since there is nothing in the CRYCB (no driver) we answer
>> that there is no queue.
>>
>> Conclusion:  we must handle this in userland, it will have the benefit
>> to keep old behavior when there is no callback.
>> OLD QEMU will not see change as they will not set aqic facility
> 
> That would mean we remain quirky.

Yes, the alternatives are:

1) We do things right, but this means we change the ABI (SPECIFICATION 
instead of OPERATION).

I think this is the best thing to do. It is the implementation proposed 
by this patch, where everything is done in the kernel, so that we are 
right whatever the userland user is (QEMU or other).

2) We preserve the old ABI for old QEMU.
That is the implementation I proposed below.


My personal opinion is that we should change the ABI and do things 
right now.
We should also do it right for TAPQ with the t bit set. I remember 
Christian already warned about this but we did not implement it.
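
To make the two alternatives easier to compare, here is a minimal
userspace sketch of the decision order. It only models the checks
quoted from the patch above; the enum, the struct and the function
names are hypothetical and are not kernel identifiers, and the
try_module_get() failure path is left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical outcome labels for this model only. */
enum pqap_outcome {
	PQAP_TO_USERSPACE,	/* -EOPNOTSUPP, userspace decides */
	PQAP_PRIVILEGED_OP,	/* PGM_PRIVILEGED_OP is injected */
	PQAP_SPECIFICATION,	/* PGM_SPECIFICATION is injected */
	PQAP_HOOK,		/* the registered callback answers */
	PQAP_NO_QUEUE		/* response code 0x01 */
};

/* Hypothetical snapshot of the inputs the handler looks at. */
struct pqap_state {
	bool ap_available;	/* ap_instructions_available() */
	bool eca_apie;		/* sie_block->eca & ECA_APIE */
	unsigned int fc;	/* function code from GR0 */
	bool problem_state;	/* PSW_MASK_PSTATE set in the guest PSW */
	bool facility_65;	/* test_kvm_facility(kvm, 65) */
	bool hook_registered;	/* crypto.pqap_hook != NULL */
};

/* Alternative 1: all checks in the kernel, in the order of the patch. */
enum pqap_outcome pqap_decide_alt1(const struct pqap_state *s)
{
	if (!s->ap_available || !s->eca_apie)
		return PQAP_TO_USERSPACE;
	if (s->fc != 0x03)		/* not AQIC: keep the old behavior */
		return PQAP_TO_USERSPACE;
	if (s->problem_state)
		return PQAP_PRIVILEGED_OP;
	if (!s->facility_65)
		return PQAP_SPECIFICATION;
	if (s->hook_registered)
		return PQAP_HOOK;
	return PQAP_NO_QUEUE;
}

/* Alternative 2: the kernel only dispatches to the hook if there is one. */
enum pqap_outcome pqap_decide_alt2(const struct pqap_state *s)
{
	return s->hook_registered ? PQAP_HOOK : PQAP_TO_USERSPACE;
}

/* Usage check: AQIC from a guest kernel without facility 65, no hook. */
void pqap_model_selfcheck(void)
{
	struct pqap_state s = {
		.ap_available = true, .eca_apie = true, .fc = 0x03,
		.problem_state = false, .facility_65 = false,
		.hook_registered = false,
	};

	assert(pqap_decide_alt1(&s) == PQAP_SPECIFICATION);
	assert(pqap_decide_alt2(&s) == PQAP_TO_USERSPACE);
}
```

The self-check shows the ABI difference discussed here: for the same
guest state, alternative 1 answers from the kernel while alternative 2
leaves the answer to userspace.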

> 
>> NEW QEMU will handle this correctly.
>>
>> In this case we also do not need to handle all other tests here but can
>> move it to the callback as Tony wanted.
>>
>> Would you agree with something simple like:
>>
>> static int handle_pqap(struct kvm_vcpu *vcpu)
>> {
>>           int ret = -EOPNOTSUPP;
>>
>>           /* Verify that the hook callback is registered and call it */
>>           if (pqap_hook)
>>                   if (try_module_get(pqap_hook->owner)) {
>>                           ret = pqap_hook->hook(vcpu);
>>                           module_put(pqap_hook->owner);
>>                   }
>>           return ret;
>> }
>>
>> All other tests in QEMU and in the callback.
>>
> 
> You stated in another email that the conclusion is wrong. I'm not sure

Forget it, I do not understand what I wanted to say there /o\ .

> what is the cleanest solution here. This effective control thing does
> make my head spin.

As I said, I do not think any effective control interferes here.
IMHO Alternative 1 is the cleanest solution.

Regards,
Pierre

Halil Pasic March 19, 2019, 2:54 p.m. UTC | #10

On Tue, 19 Mar 2019 11:01:44 +0100
Pierre Morel <pmorel@linux.ibm.com> wrote:

> On 15/03/2019 18:28, Halil Pasic wrote:

[..]

> > 
> > Things get complicated when one considers that ECA.28 is an effective
> > control.
> 
> I don't think so, ECA_28 is not really a problem.
> We do not propagate ECA_AIV in VSIE and ECA_AIV is tested in the vfio 
> driver to support GISA.
> So that the guest 3 will not support interrupt.
> 

That was not my concern, but while we are at it... I guess you refer to
the check in handle_pqap(). That seems to do -EOPNOTSUPP, i.e. go to
userspace, i.e. with today's QEMU an operation exception. Which does not
seem right.

My concern was the following. Let's assume 
ECA.28 == 1 and EECA.28 == 0 != 1
and the guest issues a PQAP (for simplicity AQIC).

Currently I guess we take a 0x04 interception and go to userspace, which
may or may not be the best thing to do.

With this patch we would take a 0x04, but (as opposed to before) if the
guest does not have facility 65 we go with a specification exception.
Operation exception should however take priority over this kind of
specification exception. So basically everything except PQAP/AQIC would
give you an operation exception (with current QEMU), but PQAP/AQIC would
give you a specification exception. Which is wrong!

AFAICT there is no way to tell if we got a 0x04 interception because
EECA.28 != 1 (and ECA.28 == 1) and FW won't interpret the AP
instructions for us, or because PQAP/AQIC is a mandatory intercept.
In other words I don't see a way to tell if EECA.28 is 1 when
interpreting PQAP/AQIC.

Do you agree?

[..]

> 
> Yes, the alternative is:
> 
> 1) We do things right but this mean we change the ABI (SPECIFICATION 
> instead of OPERATION)
> 
> I thing this is the best thing to do, it is the implementation
> proposed by this patch where all is done in Kernel, so that we are
> right what ever the userland user is (QEMU or other).
> 
> 2) We want to preserve the old ABI for old QEMU
> Then I proposed the implementation here under.
> 
> 
> My personal opinion, is that we should change the ABI and do things 
> right now.

I tend to agree. Giving an operation exception instead of a specification
exception is a bug. Whether it is a kernel or a QEMU bug is not clear to
me at the moment. 

> We should also do it right for TAPQ with t bit set. I remember
> Christian already warned about this but we did not implement it.
> 

Yes, I have some blurry memories of something similar myself. I wonder
if there was a reason, or did we just forget to address this issue.

Regards,
Halil

Pierre Morel March 19, 2019, 5:07 p.m. UTC | #11

On 19/03/2019 15:54, Halil Pasic wrote:
> On Tue, 19 Mar 2019 11:01:44 +0100
> Pierre Morel <pmorel@linux.ibm.com> wrote:
> 
>> On 15/03/2019 18:28, Halil Pasic wrote:
> 
> [..]
> 
>>>
>>> Things get complicated when one considers that ECA.28 is an effective
>>> control.
>>
>> I don't think so, ECA_28 is not really a problem.
>> We do not propagate ECA_AIV in VSIE and ECA_AIV is tested in the vfio
>> driver to support GISA.
>> So that the guest 3 will not support interrupt.
>>
> 
> That was not my concern, but while we are at it... I guess you refer to
> the check in handle_pqap(). That seems to do -EOPNOTSUPP, i.e. got to
> userspace, i.e. with today's QEMU operation exception. Which does not
> seem right.

We already discussed this, no?

> 
> My concern was the following. Let assume
> ECA.28 == 1 and EECA.28 == 0 != 1
> and guest issues a PQAP (for simplicity AQIC).
> 
> Currently I guess we take a 0x04 interception and go to userspace, which
> may or may not be the best thing to do.
> 
> With this patch we would take a 0x04, but (opposed to before) if guest
> does not have facility 65 we go with a specification exception.

This is not right.
We return -EOPNOTSUPP, which goes to QEMU, and QEMU will report an 
OPERATION exception as before.

> Operation exception should however take priority over this kind of
> specification exception. So basically everything except PQAP/AQIC would
> give you and operation exception (with current QEMU), but PQAP/AQIC would
> give you a specification exception. Which is wrong!
> 
> AFAICT there is no way to tell if we got a 04 interception because
> EECA.28 != 1 (and ECA.28 == 1) and FW won't interpret the AP
> instructions for us, or because it PQAP/AQIC is a mandatory intercept.
> In other words I don't see a way to tell if EECA.28 is 1 when
> interpreting PQAP/AQIC.
> 
> Do you agree?


No.
EECA = HOST_ECA & GUEST_ECA
Once we have made sure that AP instructions are available, HOST_ECA=1, so

(vcpu->arch.sie_block->eca & ECA_APIE) gives us the answer.

In the case HOST_ECA=0 we always go to userland as before.
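
As a sketch of that argument, under the assumption stated above that
the effective control is the AND of the host and guest bits: the mask
value below is derived from IBM bit numbering (bit 0 is the most
significant bit of the 32-bit ECA word), and the macro and function
names are made up for this example.

```c
#include <stdbool.h>
#include <stdint.h>

/* ECA.28 with bit 0 as the MSB of a 32-bit word; name is hypothetical. */
#define ECA_BIT_28	(UINT32_C(1) << (31 - 28))

/* Model of EECA.28 = HOST_ECA.28 & GUEST_ECA.28. */
bool effective_eca28(uint32_t host_eca, uint32_t guest_eca)
{
	return (host_eca & guest_eca & ECA_BIT_28) != 0;
}

/*
 * If the host bit is known to be 1, the guest bit alone decides,
 * which is what the check on sie_block->eca relies on.
 */
bool effective_eca28_host_on(uint32_t guest_eca)
{
	return effective_eca28(ECA_BIT_28, guest_eca);
}
```

With the host bit set, effective_eca28_host_on(guest_eca) is true
exactly when the guest bit is set; with the host bit clear the
effective bit is always 0, which is the "go to userland as before" case.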

> 
> [..]
> 
>>
>> Yes, the alternative is:
>>
>> 1) We do things right but this mean we change the ABI (SPECIFICATION
>> instead of OPERATION)
>>
>> I thing this is the best thing to do, it is the implementation
>> proposed by this patch where all is done in Kernel, so that we are
>> right what ever the userland user is (QEMU or other).
>>
>> 2) We want to preserve the old ABI for old QEMU
>> Then I proposed the implementation here under.
>>
>>
>> My personal opinion, is that we should change the ABI and do things
>> right now.
> 
> I tend to agree. Giving an operation exception instead of a specification
> exception is a bug. If it is a kernel or qemu bug it ain't clear to me
> at the moment.
> 
>> We should also do it right for TAPQ with t bit set. I remember
>> Christian already warned about this but we did not implement it.
>>
> 
> Yes, I have some blurry memories of something similar myself. I wonder
> if there was a reason, or did we just forget to address this issue.


I will integrate it in the next iteration too. Even if it is not IRQ 
related, the PQAP hook patch can be more general.

Regards,
Pierre

> 
> Regards,
> Halil
>

Pierre Morel March 21, 2019, 2:05 p.m. UTC | #12

On 19/03/2019 18:07, Pierre Morel wrote:
> On 19/03/2019 15:54, Halil Pasic wrote:
>> On Tue, 19 Mar 2019 11:01:44 +0100
>> Pierre Morel <pmorel@linux.ibm.com> wrote:
>>
>>> On 15/03/2019 18:28, Halil Pasic wrote:
>>

...snip...

>>
>>> We should also do it right for TAPQ with t bit set. I remember
>>> Christian already warned about this but we did not implement it.
>>>
>>
>> Yes, I have some blurry memories of something similar myself. I wonder
>> if there was a reason, or did we just forget to address this issue.
> 
> 
> I will integrate it in the next iteration too, even it is not IRQ, the 
> PQAP hook patch can be more general.

After all, I will not do this. I remember the reason why we did not do 
it back then: it is simply not intercepted until we enable it.

So we will handle it when we enable the TAPQ-t interception.

Regards,
Pierre
