Message ID | 1552493104-30510-2-git-send-email-pmorel@linux.ibm.com (mailing list archive) |
---|---|
State | New, archived |
Series | vfio: ap: AP Queue Interrupt Control |
On Wed, 13 Mar 2019 17:04:58 +0100
Pierre Morel <pmorel@linux.ibm.com> wrote:

> +/*
> + * handle_pqap: Handling pqap interception
> + * @vcpu: the vcpu having issue the pqap instruction
> + *
> + * We now support PQAP/AQIC instructions and we need to correctly
> + * answer the guest even if no dedicated driver's hook is available.
> + *
> + * The intercepting code calls a dedicated callback for this instruction
> + * if a driver did register one in the CRYPTO satellite of the
> + * SIE block.
> + *
> + * For PQAP/AQIC instructions only, verify privilege and specifications.
> + *
> + * If no callback available, the queues are not available, return this to
> + * the caller.
> + * Else return the value returned by the callback.
> + */
> +static int handle_pqap(struct kvm_vcpu *vcpu)
> +{
> +	uint8_t fc;
> +	struct ap_queue_status status = {};
> +	int ret;
> +	/* Verify that the AP instruction are available */
> +	if (!ap_instructions_available())
> +		return -EOPNOTSUPP;
> +	/* Verify that the guest is allowed to use AP instructions */
> +	if (!(vcpu->arch.sie_block->eca & ECA_APIE))
> +		return -EOPNOTSUPP;
> +	/* Verify that the function code is AQIC */
> +	fc = vcpu->run->s.regs.gprs[0] >> 24;
> +	/* We do not want to change the behavior we had before this patch*/
> +	if (fc != 0x03)
> +		return -EOPNOTSUPP;
> +
> +	/* PQAP instructions are allowed for guest kernel only */
> +	if (vcpu->arch.sie_block->gpsw.mask & PSW_MASK_PSTATE)
> +		return kvm_s390_inject_program_int(vcpu, PGM_PRIVILEGED_OP);
> +	/* AQIC instruction is allowed only if facility 65 is available */
> +	if (!test_kvm_facility(vcpu->kvm, 65))
> +		return kvm_s390_inject_program_int(vcpu, PGM_SPECIFICATION);
> +	/* Verify that the hook callback is registered and call it */
> +	if (vcpu->kvm->arch.crypto.pqap_hook) {
> +		if (!try_module_get(vcpu->kvm->arch.crypto.pqap_hook->owner))
> +			return -EOPNOTSUPP;
> +		ret = vcpu->kvm->arch.crypto.pqap_hook->hook(vcpu);
> +		module_put(vcpu->kvm->arch.crypto.pqap_hook->owner);
> +		return ret;
> +	}
> +	/*
> +	 * It is the duty of the vfio_driver to register a hook
> +	 * If it does not and we get an exception on AQIC we must
> +	 * guess that there is no vfio_ap_driver at all and no one
> +	 * to handle the guests's CRYCB and the CRYCB is empty.
> +	 */
> +	status.response_code = 0x01;

I'm still confused here, sorry. From previous discussions I recall that
this indicates "no crypto device" (please correct me if I'm wrong.)

Before this patch, we had:
- guest issues PQAP/AQIC -> drop to userspace

With a correct implementation, we get:
- guest issues PQAP/AQIC -> callback does what needs to be done

With an incorrect implementation (no callback), we get:
- guest issues PQAP/AQIC -> guest gets response code 0x01

Why not drop to userspace in that case?

> +	memcpy(&vcpu->run->s.regs.gprs[1], &status, sizeof(status));
> +	return 0;
> +}
> +
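For readers following the control flow, the order of checks in the quoted patch can be modelled in plain userspace C. This is only a sketch: `struct pqap_inputs`, `enum pqap_outcome` and `model_handle_pqap()` are hypothetical names invented for illustration, the kernel state is flattened into booleans, and the `try_module_get()` failure path (which also ends in -EOPNOTSUPP) is left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Outcomes of the quoted handler, as seen in this simplified model. */
enum pqap_outcome {
	PQAP_TO_USERSPACE,      /* handler returns -EOPNOTSUPP */
	PQAP_PGM_PRIVILEGED,    /* privileged-operation exception injected */
	PQAP_PGM_SPECIFICATION, /* specification exception injected */
	PQAP_HOOK_CALLED,       /* registered callback handles the request */
	PQAP_RESPONSE_CODE_01,  /* no hook: guest gets response code 0x01 */
};

/* Hypothetical flattened view of the state the patch inspects. */
struct pqap_inputs {
	bool ap_instructions_available;
	bool eca_apie;        /* ECA_APIE set in the SIE block */
	uint64_t gpr0;        /* guest general register 0 */
	bool problem_state;   /* PSW_MASK_PSTATE set in the guest PSW */
	bool facility_65;
	bool hook_registered;
};

/* Same extraction as the patch: shift right by 24, truncate to 8 bits. */
static uint8_t pqap_function_code(uint64_t gpr0)
{
	return (uint8_t)(gpr0 >> 24);
}

/* Checks are made in the same order as in the quoted handle_pqap(). */
static enum pqap_outcome model_handle_pqap(const struct pqap_inputs *in)
{
	if (!in->ap_instructions_available)
		return PQAP_TO_USERSPACE;
	if (!in->eca_apie)
		return PQAP_TO_USERSPACE;
	if (pqap_function_code(in->gpr0) != 0x03)
		return PQAP_TO_USERSPACE;
	if (in->problem_state)
		return PQAP_PGM_PRIVILEGED;
	if (!in->facility_65)
		return PQAP_PGM_SPECIFICATION;
	if (in->hook_registered)
		return PQAP_HOOK_CALLED;
	return PQAP_RESPONSE_CODE_01;
}
```

The last return is the case questioned in this reply: with no hook registered, the guest receives response code 0x01 where it previously dropped to userspace.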
On 15/03/2019 11:20, Cornelia Huck wrote:
> On Wed, 13 Mar 2019 17:04:58 +0100
> Pierre Morel <pmorel@linux.ibm.com> wrote:
>
>> +/*
>> + * handle_pqap: Handling pqap interception
>> + * @vcpu: the vcpu having issue the pqap instruction
>> + *
>> + * We now support PQAP/AQIC instructions and we need to correctly
>> + * answer the guest even if no dedicated driver's hook is available.
>> + *
>> + * The intercepting code calls a dedicated callback for this instruction
>> + * if a driver did register one in the CRYPTO satellite of the
>> + * SIE block.
>> + *
>> + * For PQAP/AQIC instructions only, verify privilege and specifications.
>> + *
>> + * If no callback available, the queues are not available, return this to
>> + * the caller.
>> + * Else return the value returned by the callback.
>> + */
>> +static int handle_pqap(struct kvm_vcpu *vcpu)
>> +{
>> +	uint8_t fc;
>> +	struct ap_queue_status status = {};
>> +	int ret;
>> +	/* Verify that the AP instruction are available */
>> +	if (!ap_instructions_available())
>> +		return -EOPNOTSUPP;
>> +	/* Verify that the guest is allowed to use AP instructions */
>> +	if (!(vcpu->arch.sie_block->eca & ECA_APIE))
>> +		return -EOPNOTSUPP;
>> +	/* Verify that the function code is AQIC */
>> +	fc = vcpu->run->s.regs.gprs[0] >> 24;
>> +	/* We do not want to change the behavior we had before this patch*/
>> +	if (fc != 0x03)
>> +		return -EOPNOTSUPP;
>> +
>> +	/* PQAP instructions are allowed for guest kernel only */
>> +	if (vcpu->arch.sie_block->gpsw.mask & PSW_MASK_PSTATE)
>> +		return kvm_s390_inject_program_int(vcpu, PGM_PRIVILEGED_OP);
>> +	/* AQIC instruction is allowed only if facility 65 is available */
>> +	if (!test_kvm_facility(vcpu->kvm, 65))
>> +		return kvm_s390_inject_program_int(vcpu, PGM_SPECIFICATION);
>> +	/* Verify that the hook callback is registered and call it */
>> +	if (vcpu->kvm->arch.crypto.pqap_hook) {
>> +		if (!try_module_get(vcpu->kvm->arch.crypto.pqap_hook->owner))
>> +			return -EOPNOTSUPP;
>> +		ret = vcpu->kvm->arch.crypto.pqap_hook->hook(vcpu);
>> +		module_put(vcpu->kvm->arch.crypto.pqap_hook->owner);
>> +		return ret;
>> +	}
>> +	/*
>> +	 * It is the duty of the vfio_driver to register a hook
>> +	 * If it does not and we get an exception on AQIC we must
>> +	 * guess that there is no vfio_ap_driver at all and no one
>> +	 * to handle the guests's CRYCB and the CRYCB is empty.
>> +	 */
>> +	status.response_code = 0x01;
>
> I'm still confused here, sorry. From previous discussions I recall that
> this indicates "no crypto device" (please correct me if I'm wrong.)
>
> Before this patch, we had:
> - guest issues PQAP/AQIC -> drop to userspace
>
> With a correct implementation, we get:
> - guest issues PQAP/AQIC -> callback does what needs to be done
>
> With an incorrect implementation (no callback), we get:
> - guest issues PQAP/AQIC -> guest gets response code 0x01
>
> Why not drop to userspace in that case?

This is what I had in the previous patches.
Hum, I do not remember which discussion led me to modify this.

Anyway, now that you put the finger on this problem, I think the problem
is worse.

The behavior with old / new Linux, vfio driver and qemu is:

LINUX   VFIO_AP   QEMU            PGM
OLD     x         x               OPERATION
NEW     -         OLD             SPECIFICATION
NEW     -         NEW/aqic=off    SPECIFICATION
NEW     x         NEW/aqic=on     -

x = whatever
- = absent/none

So yes there is a change in behavior for the userland for the case QEMU
does not set the AQIC facility 65, OLD QEMU or NEW QEMU wanting to behave
like an older one.

I fear we have the same problem with the privileged operation...

For the last case, when the kvm_facility(65) is set, the explanation is
the following:

This is related to the handling of PQAP AQIC which is now authorized by
this patch series.
If we authorize PQAP AQIC, by setting the bit for facility 65, the guest
can use this instruction.
If the instruction follows the specifications we must answer something
realistic and since there is nothing in the CRYCB (no driver) we answer
that there is no queue.

Conclusion: we must handle this in userland, it will have the benefit
to keep old behavior when there is no callback.
OLD QEMU will not see change as they will not set aqic facility
NEW QEMU will handle this correctly.

In this case we also do not need to handle all other tests here but can
move it to the callback as Tony wanted.

Would you agree with something simple like:

static int handle_pqap(struct kvm_vcpu *vcpu)
{
	int ret = -EOPNOTSUPP;

	/* Verify that the hook callback is registered and call it */
	if (pqap_hook)
		if (try_module_get(pqap_hook->owner)) {
			ret = pqap_hook->hook(vcpu);
			module_put(pqap_hook->owner);
		}
	return ret;
}

All other tests in QEMU and in the callback.

Thanks for the comments.

Regards,
Pierre
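The simplified handler proposed above can be exercised outside the kernel with stand-ins for the module reference counting. The sketch below is not kernel code: `struct module_stub`, `struct vcpu_stub`, `struct pqap_hook_stub`, `try_module_get_stub()`, `module_put_stub()` and `sample_hook()` are all hypothetical replacements for the real structures and for `try_module_get()` / `module_put()`, and the hook pointer is passed as a parameter instead of being read from the KVM crypto structure.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct module: just liveness and a refcount. */
struct module_stub {
	bool live;
	int refcount;
};

/* Hypothetical stand-in for struct kvm_vcpu. */
struct vcpu_stub {
	int hook_calls;
};

/* Mirrors the shape used in the thread: a callback plus its owner. */
struct pqap_hook_stub {
	int (*hook)(struct vcpu_stub *vcpu);
	struct module_stub *owner;
};

/* Stand-in for try_module_get(): fails if the owner is going away. */
static bool try_module_get_stub(struct module_stub *m)
{
	if (!m->live)
		return false;
	m->refcount++;
	return true;
}

/* Stand-in for module_put(): drops the reference taken above. */
static void module_put_stub(struct module_stub *m)
{
	m->refcount--;
}

/* Example callback; a real hook would do the priv/fc/facility checks. */
static int sample_hook(struct vcpu_stub *vcpu)
{
	vcpu->hook_calls++;
	return 0;
}

/* Same logic as the proposal: default to -EOPNOTSUPP (drop to userspace). */
static int handle_pqap_sketch(struct vcpu_stub *vcpu,
			      struct pqap_hook_stub *pqap_hook)
{
	int ret = -EOPNOTSUPP;

	if (pqap_hook && try_module_get_stub(pqap_hook->owner)) {
		ret = pqap_hook->hook(vcpu);
		module_put_stub(pqap_hook->owner);
	}
	return ret;
}
```

With this shape, both a missing hook and an owner that cannot be pinned fall through to -EOPNOTSUPP, which is the "keep old behavior when there is no callback" property described above.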
On Fri, 15 Mar 2019 14:26:34 +0100
Pierre Morel <pmorel@linux.ibm.com> wrote:

> Conclusion: we must handle this in userland, it will have the benefit
> to keep old behavior when there is no callback.
> OLD QEMU will not see change as they will not set aqic facility
> NEW QEMU will handle this correctly.
>
> In this case we also do not need to handle all other tests here but can
> move it to the callback as Tony wanted.
>
> Would you agree with something simple like:
>
> static int handle_pqap(struct kvm_vcpu *vcpu)
> {
> 	int ret = -EOPNOTSUPP;
>
> 	/* Verify that the hook callback is registered and call it */
> 	if (pqap_hook)
> 		if (try_module_get(pqap_hook->owner)) {
> 			ret = pqap_hook->hook(vcpu);
> 			module_put(pqap_hook->owner);
> 		}
> 	return ret;
> }
>
> All other tests in QEMU and in the callback.

With the hook checking for priv, fc, etc.? Yeah, might work.

But don't count on my feedback too much right now, better wait for
others' comments :) I'll resume in April, if needed.
On 15/03/2019 14:41, Cornelia Huck wrote:
> On Fri, 15 Mar 2019 14:26:34 +0100
> Pierre Morel <pmorel@linux.ibm.com> wrote:
>
>> Conclusion: we must handle this in userland, it will have the benefit
>> to keep old behavior when there is no callback.
>> OLD QEMU will not see change as they will not set aqic facility
>> NEW QEMU will handle this correctly.
>>
>> In this case we also do not need to handle all other tests here but can
>> move it to the callback as Tony wanted.
>>
>> Would you agree with something simple like:
>>
>> static int handle_pqap(struct kvm_vcpu *vcpu)
>> {
>> 	int ret = -EOPNOTSUPP;
>>
>> 	/* Verify that the hook callback is registered and call it */
>> 	if (pqap_hook)
>> 		if (try_module_get(pqap_hook->owner)) {
>> 			ret = pqap_hook->hook(vcpu);
>> 			module_put(pqap_hook->owner);
>> 		}
>> 	return ret;
>> }
>>
>> All other tests in QEMU and in the callback.
>
> With the hook checking for priv, fc, etc.? Yeah, might work.
>
> But don't count on my feedback too much right now, better wait for
> others' comments :) I'll resume in April, if needed.
>

OK, thanks

Pierre
On 15/03/2019 14:26, Pierre Morel wrote:
> On 15/03/2019 11:20, Cornelia Huck wrote:
>> On Wed, 13 Mar 2019 17:04:58 +0100
>> Pierre Morel <pmorel@linux.ibm.com> wrote:
>>
>>> +/*
>>> + * handle_pqap: Handling pqap interception
>>> + * @vcpu: the vcpu having issue the pqap instruction
>>> + *
>>> + * We now support PQAP/AQIC instructions and we need to correctly
>>> + * answer the guest even if no dedicated driver's hook is available.
>>> + *
>>> + * The intercepting code calls a dedicated callback for this
>>> instruction
>>> + * if a driver did register one in the CRYPTO satellite of the
>>> + * SIE block.
>>> + *
>>> + * For PQAP/AQIC instructions only, verify privilege and
>>> specifications.
>>> + *
>>> + * If no callback available, the queues are not available, return
>>> this to
>>> + * the caller.
>>> + * Else return the value returned by the callback.
>>> + */
>>> +static int handle_pqap(struct kvm_vcpu *vcpu)
>>> +{
>>> +	uint8_t fc;
>>> +	struct ap_queue_status status = {};
>>> +	int ret;
>>> +	/* Verify that the AP instruction are available */
>>> +	if (!ap_instructions_available())
>>> +		return -EOPNOTSUPP;
>>> +	/* Verify that the guest is allowed to use AP instructions */
>>> +	if (!(vcpu->arch.sie_block->eca & ECA_APIE))
>>> +		return -EOPNOTSUPP;
>>> +	/* Verify that the function code is AQIC */
>>> +	fc = vcpu->run->s.regs.gprs[0] >> 24;
>>> +	/* We do not want to change the behavior we had before this patch*/
>>> +	if (fc != 0x03)
>>> +		return -EOPNOTSUPP;
>>> +
>>> +	/* PQAP instructions are allowed for guest kernel only */
>>> +	if (vcpu->arch.sie_block->gpsw.mask & PSW_MASK_PSTATE)
>>> +		return kvm_s390_inject_program_int(vcpu, PGM_PRIVILEGED_OP);
>>> +	/* AQIC instruction is allowed only if facility 65 is available */
>>> +	if (!test_kvm_facility(vcpu->kvm, 65))
>>> +		return kvm_s390_inject_program_int(vcpu, PGM_SPECIFICATION);
>>> +	/* Verify that the hook callback is registered and call it */
>>> +	if (vcpu->kvm->arch.crypto.pqap_hook) {
>>> +		if (!try_module_get(vcpu->kvm->arch.crypto.pqap_hook->owner))
>>> +			return -EOPNOTSUPP;
>>> +		ret = vcpu->kvm->arch.crypto.pqap_hook->hook(vcpu);
>>> +		module_put(vcpu->kvm->arch.crypto.pqap_hook->owner);
>>> +		return ret;
>>> +	}
>>> +	/*
>>> +	 * It is the duty of the vfio_driver to register a hook
>>> +	 * If it does not and we get an exception on AQIC we must
>>> +	 * guess that there is no vfio_ap_driver at all and no one
>>> +	 * to handle the guests's CRYCB and the CRYCB is empty.
>>> +	 */
>>> +	status.response_code = 0x01;
>>
>> I'm still confused here, sorry. From previous discussions I recall that
>> this indicates "no crypto device" (please correct me if I'm wrong.)
>>
>> Before this patch, we had:
>> - guest issues PQAP/AQIC -> drop to userspace
>>
>> With a correct implementation, we get:
>> - guest issues PQAP/AQIC -> callback does what needs to be done
>>
>> With an incorrect implementation (no callback), we get:
>> - guest issues PQAP/AQIC -> guest gets response code 0x01
>>
>> Why not drop to userspace in that case?
>
> This is what I had in the previous patches.
> Hum, I do not remember which discussion led me to modify this.
>
> Anyway, now that you put the finger on this problem, I think the problem
> is worse.
>
> The behavior with old / new Linux, vfio driver and qemu is:
>
> LINUX   VFIO_AP   QEMU            PGM
> OLD     x         x               OPERATION
> NEW     -         OLD             SPECIFICATION
> NEW     -         NEW/aqic=off    SPECIFICATION
> NEW     x         NEW/aqic=on     -
>
> x = whatever
> - = absent/none
>
> So yes there is a change in behavior for the userland for the case QEMU
> does not set the AQIC facility 65, OLD QEMU or NEW QEMU wanting to behave
> like an older one.
>
> I fear we have the same problem with the privileged operation...
>
> For the last case, when the kvm_facility(65) is set, the explanation is
> the following:
>
> This is related to the handling of PQAP AQIC which is now authorized by
> this patch series.
> If we authorize PQAP AQIC, by setting the bit for facility 65, the guest
> can use this instruction.
> If the instruction follows the specifications we must answer something
> realistic and since there is nothing in the CRYCB (no driver) we answer
> that there is no queue.
>
> Conclusion: we must handle this in userland, it will have the benefit
> to keep old behavior when there is no callback.
> OLD QEMU will not see change as they will not set aqic facility
> NEW QEMU will handle this correctly.
>

Sorry, wrong conclusion, handling this in userland will bring us much
too far if we want to answer correctly for the case the hook is not
there but QEMU accepted the facility for AQIC.

The alternative is easier, we just continue to respond with the
OPERATION exception here and only handle the specification and
privileged exception cases in QEMU and in the hook.

So, I think the discussion will go on until you come back :)

Regards,
Pierre
On Fri, 15 Mar 2019 14:26:34 +0100
Pierre Morel <pmorel@linux.ibm.com> wrote:

> On 15/03/2019 11:20, Cornelia Huck wrote:
> > On Wed, 13 Mar 2019 17:04:58 +0100
> > Pierre Morel <pmorel@linux.ibm.com> wrote:
> >
> >> +/*
> >> + * handle_pqap: Handling pqap interception
> >> + * @vcpu: the vcpu having issue the pqap instruction
> >> + *
> >> + * We now support PQAP/AQIC instructions and we need to correctly
> >> + * answer the guest even if no dedicated driver's hook is available.
> >> + *
> >> + * The intercepting code calls a dedicated callback for this instruction
> >> + * if a driver did register one in the CRYPTO satellite of the
> >> + * SIE block.
> >> + *
> >> + * For PQAP/AQIC instructions only, verify privilege and specifications.
> >> + *
> >> + * If no callback available, the queues are not available, return this to
> >> + * the caller.
> >> + * Else return the value returned by the callback.
> >> + */
> >> +static int handle_pqap(struct kvm_vcpu *vcpu)
> >> +{
> >> +	uint8_t fc;
> >> +	struct ap_queue_status status = {};
> >> +	int ret;
> >> +	/* Verify that the AP instruction are available */
> >> +	if (!ap_instructions_available())
> >> +		return -EOPNOTSUPP;
> >> +	/* Verify that the guest is allowed to use AP instructions */
> >> +	if (!(vcpu->arch.sie_block->eca & ECA_APIE))
> >> +		return -EOPNOTSUPP;
> >> +	/* Verify that the function code is AQIC */
> >> +	fc = vcpu->run->s.regs.gprs[0] >> 24;
> >> +	/* We do not want to change the behavior we had before this patch*/
> >> +	if (fc != 0x03)
> >> +		return -EOPNOTSUPP;
> >> +
> >> +	/* PQAP instructions are allowed for guest kernel only */
> >> +	if (vcpu->arch.sie_block->gpsw.mask & PSW_MASK_PSTATE)
> >> +		return kvm_s390_inject_program_int(vcpu, PGM_PRIVILEGED_OP);
> >> +	/* AQIC instruction is allowed only if facility 65 is available */
> >> +	if (!test_kvm_facility(vcpu->kvm, 65))
> >> +		return kvm_s390_inject_program_int(vcpu, PGM_SPECIFICATION);
> >> +	/* Verify that the hook callback is registered and call it */
> >> +	if (vcpu->kvm->arch.crypto.pqap_hook) {
> >> +		if (!try_module_get(vcpu->kvm->arch.crypto.pqap_hook->owner))
> >> +			return -EOPNOTSUPP;
> >> +		ret = vcpu->kvm->arch.crypto.pqap_hook->hook(vcpu);
> >> +		module_put(vcpu->kvm->arch.crypto.pqap_hook->owner);
> >> +		return ret;
> >> +	}
> >> +	/*
> >> +	 * It is the duty of the vfio_driver to register a hook
> >> +	 * If it does not and we get an exception on AQIC we must
> >> +	 * guess that there is no vfio_ap_driver at all and no one
> >> +	 * to handle the guests's CRYCB and the CRYCB is empty.
> >> +	 */
> >> +	status.response_code = 0x01;
> >
> > I'm still confused here, sorry. From previous discussions I recall that
> > this indicates "no crypto device" (please correct me if I'm wrong.)
> >
> > Before this patch, we had:
> > - guest issues PQAP/AQIC -> drop to userspace
> >
> > With a correct implementation, we get:
> > - guest issues PQAP/AQIC -> callback does what needs to be done
> >
> > With an incorrect implementation (no callback), we get:
> > - guest issues PQAP/AQIC -> guest gets response code 0x01
> >
> > Why not drop to userspace in that case?
>
> This is what I had in the previous patches.
> Hum, I do not remember which discussion led me to modify this.
>
> Anyway, now that you put the finger on this problem, I think the problem
> is worse.
>
> The behavior with old / new Linux, vfio driver and qemu is:
>
> LINUX   VFIO_AP   QEMU            PGM
> OLD     x         x               OPERATION

Isn't OPERATION a bad answer if ap=on? It should not happen
with a well behaved guest because facility 65 is not indicated,
but if it does, I guess we give the wrong answer.

> NEW     -         OLD             SPECIFICATION
> NEW     -         NEW/aqic=off    SPECIFICATION
> NEW     x         NEW/aqic=on     -
>

AFAICT with LINUX == NEW we get the correct answer. OPERATION exception
is only good if ap=off.

> x = whatever
> - = absent/none
>
> So yes there is a change in behavior for the userland for the case QEMU
> does not set the AQIC facility 65, OLD QEMU or NEW QEMU wanting to behave
> like an older one.
>
> I fear we have the same problem with the privileged operation...
>

IMHO this boils down to:
* either OLD QEMU or
* OLD LINUX
should have taken care of handling the mandatory intercept for PQAP/AQIC
if ap=on (i.e. guest has AP instructions), and does not have facility 65
which was the case for OLD.

Things get complicated when one considers that ECA.28 is an effective
control.

> For the last case, when the kvm_facility(65) is set, the explanation is
> the following:
>
> This is related to the handling of PQAP AQIC which is now authorized by
> this patch series.
> If we authorize PQAP AQIC, by setting the bit for facility 65, the guest
> can use this instruction.
> If the instruction follows the specifications we must answer something
> realistic and since there is nothing in the CRYCB (no driver) we answer
> that there is no queue.
>
> Conclusion: we must handle this in userland, it will have the benefit
> to keep old behavior when there is no callback.
> OLD QEMU will not see change as they will not set aqic facility

That would mean we remain quirky.

> NEW QEMU will handle this correctly.
>
> In this case we also do not need to handle all other tests here but can
> move it to the callback as Tony wanted.
>
> Would you agree with something simple like:
>
> static int handle_pqap(struct kvm_vcpu *vcpu)
> {
> 	int ret = -EOPNOTSUPP;
>
> 	/* Verify that the hook callback is registered and call it */
> 	if (pqap_hook)
> 		if (try_module_get(pqap_hook->owner)) {
> 			ret = pqap_hook->hook(vcpu);
> 			module_put(pqap_hook->owner);
> 		}
> 	return ret;
> }
>
> All other tests in QEMU and in the callback.
>

You stated in another email that the conclusion is wrong. I'm not sure
what is the cleanest solution here. This effective control thing does
make my head spin.

Regards,
Halil
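The behaviour table quoted in this message can be restated as a small lookup, which makes it easier to see which combinations change for userspace. This is a sketch of the table as posted, not of the architecture: `enum table_pgm` and `table_pgm_for()` are hypothetical names, and the VFIO_AP column is left out because no listed row depends on it (in the quoted patch the facility 65 check also runs before the hook lookup).

```c
#include <stdbool.h>

/* Program exception the guest sees for PQAP/AQIC, per the posted table. */
enum table_pgm {
	TABLE_PGM_NONE,          /* "-" in the table: no program exception */
	TABLE_PGM_OPERATION,
	TABLE_PGM_SPECIFICATION,
};

/*
 * host_kernel_new: LINUX column is NEW (kernel carries this series).
 * qemu_aqic_on:    QEMU column is NEW/aqic=on (facility 65 offered).
 */
static enum table_pgm table_pgm_for(bool host_kernel_new, bool qemu_aqic_on)
{
	if (!host_kernel_new)
		return TABLE_PGM_OPERATION;     /* row: OLD x x */
	if (!qemu_aqic_on)
		return TABLE_PGM_SPECIFICATION; /* rows: NEW - OLD, NEW - NEW/aqic=off */
	return TABLE_PGM_NONE;                  /* row: NEW x NEW/aqic=on */
}
```

The userspace-visible change discussed here is the second branch: an old QEMU, or a new one with aqic=off, moves from OPERATION to SPECIFICATION once the host kernel is new.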
On Fri, 15 Mar 2019 15:10:25 +0100
Pierre Morel <pmorel@linux.ibm.com> wrote:

> Sorry, wrong conclusion, handling this in userland will bring us much
> too far if we want to answer correctly for the case the hook is not
> there but QEMU accepted the facility for AQIC.
>
> The alternative is easier, we just continue to respond with the
> OPERATION exception here and only handle the specification and
> privileged exception cases in QEMU and in the hook.

I don't quite understand what you mean by this paragraph. Especially
not what you mean by 'just continue to respond with the OPERATION
exception here'.

In any case if the guest is supposed to have ap instructions, and does
not have facility 65 the right answer is specification and not operation
exception. And this has to work regardless of vfio-ap module loaded or
not.

Regards,
Halil

>
> So, I think the discussion will go on until you come back :)
>
> Regards,
> Pierre
On 15/03/2019 15:10, Pierre Morel wrote:
> On 15/03/2019 14:26, Pierre Morel wrote:
>> On 15/03/2019 11:20, Cornelia Huck wrote:
>>> On Wed, 13 Mar 2019 17:04:58 +0100
>>> Pierre Morel <pmorel@linux.ibm.com> wrote:
>>>
>>>> +/*
>>>> + * handle_pqap: Handling pqap interception
>>>> + * @vcpu: the vcpu having issue the pqap instruction
>>>> + *
>>>> + * We now support PQAP/AQIC instructions and we need to correctly
>>>> + * answer the guest even if no dedicated driver's hook is available.
>>>> + *
>>>> + * The intercepting code calls a dedicated callback for this
>>>> instruction
>>>> + * if a driver did register one in the CRYPTO satellite of the
>>>> + * SIE block.
>>>> + *
>>>> + * For PQAP/AQIC instructions only, verify privilege and
>>>> specifications.
>>>> + *
>>>> + * If no callback available, the queues are not available, return
>>>> this to
>>>> + * the caller.
>>>> + * Else return the value returned by the callback.
>>>> + */
>>>> +static int handle_pqap(struct kvm_vcpu *vcpu)
>>>> +{
>>>> +	uint8_t fc;
>>>> +	struct ap_queue_status status = {};
>>>> +	int ret;
>>>> +	/* Verify that the AP instruction are available */
>>>> +	if (!ap_instructions_available())
>>>> +		return -EOPNOTSUPP;
>>>> +	/* Verify that the guest is allowed to use AP instructions */
>>>> +	if (!(vcpu->arch.sie_block->eca & ECA_APIE))
>>>> +		return -EOPNOTSUPP;
>>>> +	/* Verify that the function code is AQIC */
>>>> +	fc = vcpu->run->s.regs.gprs[0] >> 24;
>>>> +	/* We do not want to change the behavior we had before this
>>>> patch*/
>>>> +	if (fc != 0x03)
>>>> +		return -EOPNOTSUPP;
>>>> +
>>>> +	/* PQAP instructions are allowed for guest kernel only */
>>>> +	if (vcpu->arch.sie_block->gpsw.mask & PSW_MASK_PSTATE)
>>>> +		return kvm_s390_inject_program_int(vcpu, PGM_PRIVILEGED_OP);
>>>> +	/* AQIC instruction is allowed only if facility 65 is available */
>>>> +	if (!test_kvm_facility(vcpu->kvm, 65))
>>>> +		return kvm_s390_inject_program_int(vcpu, PGM_SPECIFICATION);
>>>> +	/* Verify that the hook callback is registered and call it */
>>>> +	if (vcpu->kvm->arch.crypto.pqap_hook) {
>>>> +		if (!try_module_get(vcpu->kvm->arch.crypto.pqap_hook->owner))
>>>> +			return -EOPNOTSUPP;
>>>> +		ret = vcpu->kvm->arch.crypto.pqap_hook->hook(vcpu);
>>>> +		module_put(vcpu->kvm->arch.crypto.pqap_hook->owner);
>>>> +		return ret;
>>>> +	}
>>>> +	/*
>>>> +	 * It is the duty of the vfio_driver to register a hook
>>>> +	 * If it does not and we get an exception on AQIC we must
>>>> +	 * guess that there is no vfio_ap_driver at all and no one
>>>> +	 * to handle the guests's CRYCB and the CRYCB is empty.
>>>> +	 */
>>>> +	status.response_code = 0x01;
>>>
>>> I'm still confused here, sorry. From previous discussions I recall that
>>> this indicates "no crypto device" (please correct me if I'm wrong.)
>>>
>>> Before this patch, we had:
>>> - guest issues PQAP/AQIC -> drop to userspace
>>>
>>> With a correct implementation, we get:
>>> - guest issues PQAP/AQIC -> callback does what needs to be done
>>>
>>> With an incorrect implementation (no callback), we get:
>>> - guest issues PQAP/AQIC -> guest gets response code 0x01
>>>
>>> Why not drop to userspace in that case?
>>
>> This is what I had in the previous patches.
>> Hum, I do not remember which discussion led me to modify this.
>>
>> Anyway, now that you put the finger on this problem, I think the
>> problem is worse.
>>
>> The behavior with old / new Linux, vfio driver and qemu is:
>>
>> LINUX   VFIO_AP   QEMU            PGM
>> OLD     x         x               OPERATION
>> NEW     -         OLD             SPECIFICATION
>> NEW     -         NEW/aqic=off    SPECIFICATION
>> NEW     x         NEW/aqic=on     -
>>
>> x = whatever
>> - = absent/none
>>
>> So yes there is a change in behavior for the userland for the case
>> QEMU does not set the AQIC facility 65, OLD QEMU or NEW QEMU wanting to
>> behave like an older one.
>>
>> I fear we have the same problem with the privileged operation...
>>
>> For the last case, when the kvm_facility(65) is set, the explanation
>> is the following:
>>
>> This is related to the handling of PQAP AQIC which is now authorized
>> by this patch series.
>> If we authorize PQAP AQIC, by setting the bit for facility 65, the
>> guest can use this instruction.
>> If the instruction follows the specifications we must answer something
>> realistic and since there is nothing in the CRYCB (no driver) we
>> answer that there is no queue.
>>
>> Conclusion: we must handle this in userland, it will have the benefit
>> to keep old behavior when there is no callback.
>> OLD QEMU will not see change as they will not set aqic facility
>> NEW QEMU will handle this correctly.
>>
>
> Sorry, wrong conclusion, handling this in userland will bring us much
> too far if we want to answer correctly for the case the hook is not
> there but QEMU accepted the facility for AQIC.

Sorry, forget it, I was tired.

Pierre

>
> The alternative is easier, we just continue to respond with the
> OPERATION exception here and only handle the specification and
> privileged exception cases in QEMU and in the hook.
>
> So, I think the discussion will go on until you come back :)
>
> Regards,
> Pierre
>
On 15/03/2019 18:28, Halil Pasic wrote:
> On Fri, 15 Mar 2019 14:26:34 +0100
> Pierre Morel <pmorel@linux.ibm.com> wrote:
>
>> On 15/03/2019 11:20, Cornelia Huck wrote:
>>> On Wed, 13 Mar 2019 17:04:58 +0100
>>> Pierre Morel <pmorel@linux.ibm.com> wrote:
>>>
>>>> +/*
>>>> + * handle_pqap: Handling pqap interception
>>>> + * @vcpu: the vcpu having issue the pqap instruction
>>>> + *
>>>> + * We now support PQAP/AQIC instructions and we need to correctly
>>>> + * answer the guest even if no dedicated driver's hook is available.
>>>> + *
>>>> + * The intercepting code calls a dedicated callback for this instruction
>>>> + * if a driver did register one in the CRYPTO satellite of the
>>>> + * SIE block.
>>>> + *
>>>> + * For PQAP/AQIC instructions only, verify privilege and specifications.
>>>> + *
>>>> + * If no callback available, the queues are not available, return this to
>>>> + * the caller.
>>>> + * Else return the value returned by the callback.
>>>> + */
>>>> +static int handle_pqap(struct kvm_vcpu *vcpu)
>>>> +{
>>>> +	uint8_t fc;
>>>> +	struct ap_queue_status status = {};
>>>> +	int ret;
>>>> +	/* Verify that the AP instruction are available */
>>>> +	if (!ap_instructions_available())
>>>> +		return -EOPNOTSUPP;
>>>> +	/* Verify that the guest is allowed to use AP instructions */
>>>> +	if (!(vcpu->arch.sie_block->eca & ECA_APIE))
>>>> +		return -EOPNOTSUPP;
>>>> +	/* Verify that the function code is AQIC */
>>>> +	fc = vcpu->run->s.regs.gprs[0] >> 24;
>>>> +	/* We do not want to change the behavior we had before this patch*/
>>>> +	if (fc != 0x03)
>>>> +		return -EOPNOTSUPP;
>>>> +
>>>> +	/* PQAP instructions are allowed for guest kernel only */
>>>> +	if (vcpu->arch.sie_block->gpsw.mask & PSW_MASK_PSTATE)
>>>> +		return kvm_s390_inject_program_int(vcpu, PGM_PRIVILEGED_OP);
>>>> +	/* AQIC instruction is allowed only if facility 65 is available */
>>>> +	if (!test_kvm_facility(vcpu->kvm, 65))
>>>> +		return kvm_s390_inject_program_int(vcpu, PGM_SPECIFICATION);
>>>> +	/* Verify that the hook callback is registered and call it */
>>>> +	if (vcpu->kvm->arch.crypto.pqap_hook) {
>>>> +		if (!try_module_get(vcpu->kvm->arch.crypto.pqap_hook->owner))
>>>> +			return -EOPNOTSUPP;
>>>> +		ret = vcpu->kvm->arch.crypto.pqap_hook->hook(vcpu);
>>>> +		module_put(vcpu->kvm->arch.crypto.pqap_hook->owner);
>>>> +		return ret;
>>>> +	}
>>>> +	/*
>>>> +	 * It is the duty of the vfio_driver to register a hook
>>>> +	 * If it does not and we get an exception on AQIC we must
>>>> +	 * guess that there is no vfio_ap_driver at all and no one
>>>> +	 * to handle the guests's CRYCB and the CRYCB is empty.
>>>> +	 */
>>>> +	status.response_code = 0x01;
>>>
>>> I'm still confused here, sorry. From previous discussions I recall that
>>> this indicates "no crypto device" (please correct me if I'm wrong.)
>>>
>>> Before this patch, we had:
>>> - guest issues PQAP/AQIC -> drop to userspace
>>>
>>> With a correct implementation, we get:
>>> - guest issues PQAP/AQIC -> callback does what needs to be done
>>>
>>> With an incorrect implementation (no callback), we get:
>>> - guest issues PQAP/AQIC -> guest gets response code 0x01
>>>
>>> Why not drop to userspace in that case?
>>
>> This is what I had in the previous patches.
>> Hum, I do not remember which discussion led me to modify this.
>>
>> Anyway, now that you put the finger on this problem, I think the problem
>> is worse.
>>
>> The behavior with old / new Linux, vfio driver and qemu is:
>>
>> LINUX   VFIO_AP   QEMU            PGM
>> OLD     x         x               OPERATION
>
> Isn't OPERATION a bad answer if ap=on? It should not happen
> with a well behaved guest because facility 65 is not indicated,
> but if it does, I guess we give the wrong answer.

It is clearly wrong but we can not change the past :)

>
>> NEW     -         OLD             SPECIFICATION
>> NEW     -         NEW/aqic=off    SPECIFICATION
>> NEW     x         NEW/aqic=on     -
>>
>
> AFAICT with LINUX == NEW we get the correct answer. OPERATION exception
> is only good if ap=off.

Exact.

>
>> x = whatever
>> - = absent/none
>>
>> So yes there is a change in behavior for the userland for the case QEMU
>> does not set the AQIC facility 65, OLD QEMU or NEW QEMU wanting to behave
>> like an older one.
>>
>> I fear we have the same problem with the privileged operation...
>>
>
> IMHO this boils down to:
> * either OLD QEMU or
> * OLD LINUX
> should have taken care of handling the mandatory intercept for PQAP/AQIC
> if ap=on (i.e. guest has AP instructions), and does not have facility 65
> which was the case for OLD.

yes

>
> Things get complicated when one considers that ECA.28 is an effective
> control.

I don't think so, ECA_28 is not really a problem.
We do not propagate ECA_AIV in VSIE and ECA_AIV is tested in the vfio
driver to support GISA.
So that the guest 3 will not support interrupt.

>
>> For the last case, when the kvm_facility(65) is set, the explanation is
>> the following:
>>
>> This is related to the handling of PQAP AQIC which is now authorized by
>> this patch series.
>> If we authorize PQAP AQIC, by setting the bit for facility 65, the guest
>> can use this instruction.
>> If the instruction follows the specifications we must answer something
>> realistic and since there is nothing in the CRYCB (no driver) we answer
>> that there is no queue.
>>
>> Conclusion: we must handle this in userland, it will have the benefit
>> to keep old behavior when there is no callback.
>> OLD QEMU will not see change as they will not set aqic facility
>
> That would mean we remain quirky.

Yes, the alternative is:

1) We do things right but this means we change the ABI (SPECIFICATION
instead of OPERATION)

   I think this is the best thing to do, it is the implementation
   proposed by this patch where all is done in Kernel, so that we are
   right whatever the userland user is (QEMU or other).

2) We want to preserve the old ABI for old QEMU
   Then I proposed the implementation below.

My personal opinion is that we should change the ABI and do things
right now.
We should also do it right for TAPQ with t bit set. I remember Christian
already warned about this but we did not implement it.

>
>> NEW QEMU will handle this correctly.
>>
>> In this case we also do not need to handle all other tests here but can
>> move it to the callback as Tony wanted.
>>
>> Would you agree with something simple like:
>>
>> static int handle_pqap(struct kvm_vcpu *vcpu)
>> {
>> 	int ret = -EOPNOTSUPP;
>>
>> 	/* Verify that the hook callback is registered and call it */
>> 	if (pqap_hook)
>> 		if (try_module_get(pqap_hook->owner)) {
>> 			ret = pqap_hook->hook(vcpu);
>> 			module_put(pqap_hook->owner);
>> 		}
>> 	return ret;
>> }
>>
>> All other tests in QEMU and in the callback.
>>
>
> You stated in another email that the conclusion is wrong. I'm not sure

Forget it, I do not understand what I wanted to say there /o\ .

> what is the cleanest solution here. This effective control thing does
> make my head spin.

As I said I do not think any effective control interferes here.

IMHO Alternative 1 is the cleanest solution

Regards,
Pierre
On Tue, 19 Mar 2019 11:01:44 +0100
Pierre Morel <pmorel@linux.ibm.com> wrote:

> On 15/03/2019 18:28, Halil Pasic wrote:

[..]

> >
> > Things get complicated when one considers that ECA.28 is an effective
> > control.
>
> I don't think so, ECA_28 is not really a problem.
> We do not propagate ECA_AIV in VSIE and ECA_AIV is tested in the vfio
> driver to support GISA.
> So that the guest 3 will not support interrupts.
>

That was not my concern, but while we are at it... I guess you refer to
the check in handle_pqap(). That seems to do -EOPNOTSUPP, i.e. go to
userspace, i.e. with today's QEMU operation exception. Which does not
seem right.

My concern was the following. Let's assume
ECA.28 == 1 and EECA.28 == 0 != 1
and guest issues a PQAP (for simplicity AQIC).

Currently I guess we take a 0x04 interception and go to userspace, which
may or may not be the best thing to do.

With this patch we would take a 0x04, but (opposed to before) if guest
does not have facility 65 we go with a specification exception.
Operation exception should however take priority over this kind of
specification exception. So basically everything except PQAP/AQIC would
give you an operation exception (with current QEMU), but PQAP/AQIC would
give you a specification exception. Which is wrong!

AFAICT there is no way to tell if we got a 04 interception because
EECA.28 != 1 (and ECA.28 == 1) and FW won't interpret the AP
instructions for us, or because PQAP/AQIC is a mandatory intercept.
In other words I don't see a way to tell if EECA.28 is 1 when
interpreting PQAP/AQIC.

Do you agree?

[..]

>
> Yes, the alternative is:
>
> 1) We do things right but this means we change the ABI (SPECIFICATION
> instead of OPERATION)
>
> I think this is the best thing to do, it is the implementation
> proposed by this patch where all is done in Kernel, so that we are
> right whatever the userland user is (QEMU or other).
>
> 2) We want to preserve the old ABI for old QEMU
> Then I proposed the implementation below.
>
>
> My personal opinion is that we should change the ABI and do things
> right now.

I tend to agree. Giving an operation exception instead of a specification
exception is a bug. If it is a kernel or qemu bug it ain't clear to me
at the moment.

> We should also do it right for TAPQ with t bit set. I remember
> Christian already warned about this but we did not implement it.
>

Yes, I have some blurry memories of something similar myself. I wonder
if there was a reason, or did we just forget to address this issue.

Regards,
Halil
On 19/03/2019 15:54, Halil Pasic wrote:
> On Tue, 19 Mar 2019 11:01:44 +0100
> Pierre Morel <pmorel@linux.ibm.com> wrote:
>
>> On 15/03/2019 18:28, Halil Pasic wrote:
>
> [..]
>
>>>
>>> Things get complicated when one considers that ECA.28 is an effective
>>> control.
>>
>> I don't think so, ECA_28 is not really a problem.
>> We do not propagate ECA_AIV in VSIE and ECA_AIV is tested in the vfio
>> driver to support GISA.
>> So that the guest 3 will not support interrupts.
>>
>
> That was not my concern, but while we are at it... I guess you refer to
> the check in handle_pqap(). That seems to do -EOPNOTSUPP, i.e. go to
> userspace, i.e. with today's QEMU operation exception. Which does not
> seem right.

We already discussed this, no?

>
> My concern was the following. Let's assume
> ECA.28 == 1 and EECA.28 == 0 != 1
> and guest issues a PQAP (for simplicity AQIC).
>
> Currently I guess we take a 0x04 interception and go to userspace, which
> may or may not be the best thing to do.
>
> With this patch we would take a 0x04, but (opposed to before) if guest
> does not have facility 65 we go with a specification exception.

This is not right.
We return -EOPNOTSUPP which will be intercepted by QEMU which will
report an OPERATION exception as before.

> Operation exception should however take priority over this kind of
> specification exception. So basically everything except PQAP/AQIC would
> give you an operation exception (with current QEMU), but PQAP/AQIC would
> give you a specification exception. Which is wrong!
>
> AFAICT there is no way to tell if we got a 04 interception because
> EECA.28 != 1 (and ECA.28 == 1) and FW won't interpret the AP
> instructions for us, or because PQAP/AQIC is a mandatory intercept.
> In other words I don't see a way to tell if EECA.28 is 1 when
> interpreting PQAP/AQIC.
>
> Do you agree?

No.
EECA = HOST_ECA & GUEST_ECA
after we made sure that AP instructions are available, HOST_ECA=1
(vcpu->arch.sie_block->eca & ECA_APIE) gives us the answer.

In the case HOST_ECA=0 we always go to userland as before.

>
> [..]
>
>>
>> Yes, the alternative is:
>>
>> 1) We do things right but this means we change the ABI (SPECIFICATION
>> instead of OPERATION)
>>
>> I think this is the best thing to do, it is the implementation
>> proposed by this patch where all is done in Kernel, so that we are
>> right whatever the userland user is (QEMU or other).
>>
>> 2) We want to preserve the old ABI for old QEMU
>> Then I proposed the implementation below.
>>
>>
>> My personal opinion is that we should change the ABI and do things
>> right now.
>
> I tend to agree. Giving an operation exception instead of a specification
> exception is a bug. If it is a kernel or qemu bug it ain't clear to me
> at the moment.
>
>> We should also do it right for TAPQ with t bit set. I remember
>> Christian already warned about this but we did not implement it.
>>
>
> Yes, I have some blurry memories of something similar myself. I wonder
> if there was a reason, or did we just forget to address this issue.

I will integrate it in the next iteration too, even if it is not IRQ, the
PQAP hook patch can be more general.

Regards,
Pierre

>
> Regards,
> Halil
>
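The relation Pierre states in this reply (the effective control is the AND of the host and guest controls) can be written out as a small userspace model. This is only a sketch of that statement, not kernel code: `effective_eca()` and `ap_interpreted()` are hypothetical helper names, and the `ECA_APIE` value is assumed here from ECA.28 being bit 28 of a 32-bit word counted from the most significant bit.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed value: bit 28 of a 32-bit word, counting the MSB as bit 0. */
#define ECA_APIE 0x00000008u

/* Hypothetical helper: the effective control is the AND of both levels. */
static uint32_t effective_eca(uint32_t host_eca, uint32_t guest_eca)
{
	return host_eca & guest_eca;
}

/* Hypothetical helper: true when AP interpretation is effectively enabled. */
static bool ap_interpreted(uint32_t host_eca, uint32_t guest_eca)
{
	return (effective_eca(host_eca, guest_eca) & ECA_APIE) != 0;
}
```

Under this model, once the host bit is known to be 1 the guest bit alone decides the effective value, which is the argument made above. Whether that settles Halil's concern about telling the two causes of a 0x04 interception apart is left open by the thread.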
On 19/03/2019 18:07, Pierre Morel wrote:
> On 19/03/2019 15:54, Halil Pasic wrote:
>> On Tue, 19 Mar 2019 11:01:44 +0100
>> Pierre Morel <pmorel@linux.ibm.com> wrote:
>>
>>> On 15/03/2019 18:28, Halil Pasic wrote:
>>

...snip...

>>
>>> We should also do it right for TAPQ with t bit set. I remember
>>> Christian already warned about this but we did not implement it.
>>>
>>
>> Yes, I have some blurry memories of something similar myself. I wonder
>> if there was a reason, or did we just forget to address this issue.
>
>
> I will integrate it in the next iteration too, even if it is not IRQ, the
> PQAP hook patch can be more general.

After all, I will not do this. I remember the reason why we did not do
it back then: it is simply not intercepted until we enable it.
So we will handle it when we enable the TAPQ-t interception.

Regards,
Pierre
diff --git a/arch/s390/include/asm/kvm_host.h b/arch/s390/include/asm/kvm_host.h
index a496276..624460b 100644
--- a/arch/s390/include/asm/kvm_host.h
+++ b/arch/s390/include/asm/kvm_host.h
@@ -18,6 +18,7 @@
 #include <linux/kvm_host.h>
 #include <linux/kvm.h>
 #include <linux/seqlock.h>
+#include <linux/module.h>
 #include <asm/debug.h>
 #include <asm/cpu.h>
 #include <asm/fpu/api.h>
@@ -721,8 +722,15 @@ struct kvm_s390_cpu_model {
 	unsigned short ibc;
 };
 
+struct kvm_s390_module_hook {
+	int (*hook)(struct kvm_vcpu *vcpu);
+	void *data;
+	struct module *owner;
+};
+
 struct kvm_s390_crypto {
 	struct kvm_s390_crypto_cb *crycb;
+	struct kvm_s390_module_hook *pqap_hook;
 	__u32 crycbd;
 	__u8 aes_kw;
 	__u8 dea_kw;
diff --git a/arch/s390/kvm/priv.c b/arch/s390/kvm/priv.c
index 8679bd7..72f683a 100644
--- a/arch/s390/kvm/priv.c
+++ b/arch/s390/kvm/priv.c
@@ -27,6 +27,7 @@
 #include <asm/io.h>
 #include <asm/ptrace.h>
 #include <asm/sclp.h>
+#include <asm/ap.h>
 #include "gaccess.h"
 #include "kvm-s390.h"
 #include "trace.h"
@@ -592,6 +593,65 @@ static int handle_io_inst(struct kvm_vcpu *vcpu)
 	}
 }
 
+/*
+ * handle_pqap: Handling pqap interception
+ * @vcpu: the vcpu having issue the pqap instruction
+ *
+ * We now support PQAP/AQIC instructions and we need to correctly
+ * answer the guest even if no dedicated driver's hook is available.
+ *
+ * The intercepting code calls a dedicated callback for this instruction
+ * if a driver did register one in the CRYPTO satellite of the
+ * SIE block.
+ *
+ * For PQAP/AQIC instructions only, verify privilege and specifications.
+ *
+ * If no callback available, the queues are not available, return this to
+ * the caller.
+ * Else return the value returned by the callback.
+ */
+static int handle_pqap(struct kvm_vcpu *vcpu)
+{
+	uint8_t fc;
+	struct ap_queue_status status = {};
+	int ret;
+	/* Verify that the AP instruction are available */
+	if (!ap_instructions_available())
+		return -EOPNOTSUPP;
+	/* Verify that the guest is allowed to use AP instructions */
+	if (!(vcpu->arch.sie_block->eca & ECA_APIE))
+		return -EOPNOTSUPP;
+	/* Verify that the function code is AQIC */
+	fc = vcpu->run->s.regs.gprs[0] >> 24;
+	/* We do not want to change the behavior we had before this patch*/
+	if (fc != 0x03)
+		return -EOPNOTSUPP;
+
+	/* PQAP instructions are allowed for guest kernel only */
+	if (vcpu->arch.sie_block->gpsw.mask & PSW_MASK_PSTATE)
+		return kvm_s390_inject_program_int(vcpu, PGM_PRIVILEGED_OP);
+	/* AQIC instruction is allowed only if facility 65 is available */
+	if (!test_kvm_facility(vcpu->kvm, 65))
+		return kvm_s390_inject_program_int(vcpu, PGM_SPECIFICATION);
+	/* Verify that the hook callback is registered and call it */
+	if (vcpu->kvm->arch.crypto.pqap_hook) {
+		if (!try_module_get(vcpu->kvm->arch.crypto.pqap_hook->owner))
+			return -EOPNOTSUPP;
+		ret = vcpu->kvm->arch.crypto.pqap_hook->hook(vcpu);
+		module_put(vcpu->kvm->arch.crypto.pqap_hook->owner);
+		return ret;
+	}
+	/*
+	 * It is the duty of the vfio_driver to register a hook
+	 * If it does not and we get an exception on AQIC we must
+	 * guess that there is no vfio_ap_driver at all and no one
+	 * to handle the guests's CRYCB and the CRYCB is empty.
+	 */
+	status.response_code = 0x01;
+	memcpy(&vcpu->run->s.regs.gprs[1], &status, sizeof(status));
+	return 0;
+}
+
 static int handle_stfl(struct kvm_vcpu *vcpu)
 {
 	int rc;
@@ -878,6 +938,8 @@ int kvm_s390_handle_b2(struct kvm_vcpu *vcpu)
 		return handle_sthyi(vcpu);
 	case 0x7d:
 		return handle_stsi(vcpu);
+	case 0xaf:
+		return handle_pqap(vcpu);
 	case 0xb1:
 		return handle_stfl(vcpu);
 	case 0xb2:
diff --git a/drivers/s390/crypto/vfio_ap_private.h b/drivers/s390/crypto/vfio_ap_private.h
index 76b7f98..a910be1 100644
--- a/drivers/s390/crypto/vfio_ap_private.h
+++ b/drivers/s390/crypto/vfio_ap_private.h
@@ -16,6 +16,7 @@
 #include <linux/mdev.h>
 #include <linux/delay.h>
 #include <linux/mutex.h>
+#include <linux/kvm_host.h>
 
 #include "ap_bus.h"
 
@@ -81,6 +82,7 @@ struct ap_matrix_mdev {
 	struct ap_matrix matrix;
 	struct notifier_block group_notifier;
 	struct kvm *kvm;
+	struct kvm_s390_module_hook pqap_hook;
 };
 
 extern int vfio_ap_mdev_register(void);
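The patch only adds the `pqap_hook` fields; it does not show a driver filling them in. As an illustration of how the pieces could fit together, here is a minimal userspace sketch. Everything prefixed `demo_` or `mock_` is hypothetical, the `struct module` and `struct kvm_vcpu` definitions are stand-ins rather than the kernel types, and the dispatch follows the simplified `handle_pqap()` proposed earlier in the thread rather than the version in this diff.

```c
#include <errno.h>
#include <stddef.h>

/* Userspace stand-ins for kernel types; only what this sketch needs. */
struct kvm_vcpu { int id; };
struct module { int refcount; int alive; };

/* Same layout as the structure added by the patch. */
struct kvm_s390_module_hook {
	int (*hook)(struct kvm_vcpu *vcpu);
	void *data;
	struct module *owner;
};

struct kvm_s390_crypto {
	struct kvm_s390_module_hook *pqap_hook;
};

/* Mock of try_module_get(): fails once the module is going away. */
static int mock_try_module_get(struct module *m)
{
	if (!m->alive)
		return 0;
	m->refcount++;
	return 1;
}

/* Mock of module_put(): drops the reference taken above. */
static void mock_module_put(struct module *m)
{
	m->refcount--;
}

/* Hypothetical driver callback; a real one would emulate PQAP/AQIC. */
static int demo_pqap_hook(struct kvm_vcpu *vcpu)
{
	return vcpu->id;	/* arbitrary value so the caller can see it ran */
}

/* Hypothetical registration step a driver might perform. */
static void demo_register(struct kvm_s390_crypto *crypto,
			  struct kvm_s390_module_hook *h, struct module *owner)
{
	h->hook = demo_pqap_hook;
	h->data = NULL;
	h->owner = owner;
	crypto->pqap_hook = h;
}

/* Dispatch: call the hook if present and its owner can be pinned. */
static int demo_dispatch(struct kvm_s390_crypto *crypto, struct kvm_vcpu *vcpu)
{
	int ret = -EOPNOTSUPP;
	struct kvm_s390_module_hook *h = crypto->pqap_hook;

	if (h && mock_try_module_get(h->owner)) {
		ret = h->hook(vcpu);
		mock_module_put(h->owner);
	}
	return ret;
}
```

With no hook registered, or with an owner that can no longer be pinned, the dispatch returns `-EOPNOTSUPP`, which in the real handler means dropping to userspace. The reference is held only for the duration of the callback.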
We prepare the interception of the PQAP/AQIC instruction for
the case the AQIC facility is enabled in the guest.

We add a callback inside the KVM arch structure for s390 for
a VFIO driver to handle a specific response to the PQAP
instruction with the AQIC command and only this command.
The preceding behavior for other commands should not change.

We inject the correct exceptions from inside KVM for the case the
callback is not initialized, which happens when the vfio_ap driver
is not loaded.

It is the duty of the vfio_driver to set up a pqap callback inside
the crypto structure. If the callback has been set up we call it.
If not, we set up an answer considering that no queue is available
for the guest.

We consider it the responsibility of the driver to always initialize
the PQAP callback if it defines queues by initializing the CRYCB for
a guest.

Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
---
 arch/s390/include/asm/kvm_host.h      |  8 +++++
 arch/s390/kvm/priv.c                  | 62 +++++++++++++++++++++++++++++++++++
 drivers/s390/crypto/vfio_ap_private.h |  2 ++
 3 files changed, 72 insertions(+)
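The order of checks in the patch's `handle_pqap()` is what the review discussion turns on, so it may help to see it as a pure decision function. The sketch below is a userspace model under stated assumptions: `struct pqap_state`, `enum pqap_outcome`, and `pqap_decide()` are hypothetical names, and each outcome only labels what the real handler would do (return `-EOPNOTSUPP`, inject a program interrupt, call the hook, or store response code 0x01).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical labels for what the real handler would do. */
enum pqap_outcome {
	PQAP_TO_USERSPACE,	/* models returning -EOPNOTSUPP */
	PQAP_PGM_PRIVILEGED,	/* models injecting PGM_PRIVILEGED_OP */
	PQAP_PGM_SPECIFICATION,	/* models injecting PGM_SPECIFICATION */
	PQAP_CALL_HOOK,		/* models calling the registered hook */
	PQAP_RC_NO_QUEUE,	/* models response code 0x01 in GPR1 */
};

/* Hypothetical snapshot of the inputs the handler looks at. */
struct pqap_state {
	bool host_has_ap;	/* ap_instructions_available() */
	bool eca_apie;		/* sie_block->eca & ECA_APIE */
	uint64_t gpr0;		/* guest general register 0 */
	bool problem_state;	/* gpsw.mask & PSW_MASK_PSTATE */
	bool facility_65;	/* test_kvm_facility(kvm, 65) */
	bool hook_registered;	/* crypto.pqap_hook != NULL */
	bool module_get_ok;	/* try_module_get() would succeed */
};

/* Mirrors the check order of handle_pqap() in the patch. */
static enum pqap_outcome pqap_decide(const struct pqap_state *s)
{
	uint8_t fc;

	if (!s->host_has_ap || !s->eca_apie)
		return PQAP_TO_USERSPACE;
	/* Same extraction as the patch: shift by 24, keep the low byte. */
	fc = (uint8_t)(s->gpr0 >> 24);
	if (fc != 0x03)
		return PQAP_TO_USERSPACE;
	if (s->problem_state)
		return PQAP_PGM_PRIVILEGED;
	if (!s->facility_65)
		return PQAP_PGM_SPECIFICATION;
	if (s->hook_registered)
		return s->module_get_ok ? PQAP_CALL_HOOK : PQAP_TO_USERSPACE;
	return PQAP_RC_NO_QUEUE;
}
```

In this model the specification exception is only reachable after the ECA_APIE check has passed, which is the point Pierre makes in reply to Halil. The last branch is the one Cornelia questions: no hook yields response code 0x01 instead of a drop to userspace.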