Message ID | 20221213154437.15480-8-akrowiak@linux.ibm.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | improve AP queue reset processing | expand |
On 2022-12-13 16:44, Tony Krowiak wrote: > Clean up IRQ resources even when a PQAP(ZAPQ) function fails with an > error > not handled by a case statement. > > Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com> > --- > drivers/s390/crypto/vfio_ap_ops.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/s390/crypto/vfio_ap_ops.c > b/drivers/s390/crypto/vfio_ap_ops.c > index e80c5a6b91be..2dd8db9ddb39 100644 > --- a/drivers/s390/crypto/vfio_ap_ops.c > +++ b/drivers/s390/crypto/vfio_ap_ops.c > @@ -1676,7 +1676,7 @@ static int vfio_ap_mdev_reset_queue(struct > vfio_ap_queue *q) > "PQAP/ZAPQ for %02x.%04x failed with invalid rc=%u\n", > AP_QID_CARD(q->apqn), AP_QID_QUEUE(q->apqn), > status.response_code); > - return -EIO; > + break; > } > > vfio_ap_free_aqic_resources(q); Reviewed-by: Harald Freudenberger <freude@linux.ibm.com>
On Tue, 13 Dec 2022 10:44:37 -0500 Tony Krowiak <akrowiak@linux.ibm.com> wrote: > Clean up IRQ resources even when a PQAP(ZAPQ) function fails with an error > not handled by a case statement. Why? I'm afraid this is a step in the wrong direction... > > Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com> > --- > drivers/s390/crypto/vfio_ap_ops.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/s390/crypto/vfio_ap_ops.c b/drivers/s390/crypto/vfio_ap_ops.c > index e80c5a6b91be..2dd8db9ddb39 100644 > --- a/drivers/s390/crypto/vfio_ap_ops.c > +++ b/drivers/s390/crypto/vfio_ap_ops.c > @@ -1676,7 +1676,7 @@ static int vfio_ap_mdev_reset_queue(struct vfio_ap_queue *q) > "PQAP/ZAPQ for %02x.%04x failed with invalid rc=%u\n", > AP_QID_CARD(q->apqn), AP_QID_QUEUE(q->apqn), > status.response_code); > - return -EIO; > + break; > } > > vfio_ap_free_aqic_resources(q);
On 12/19/22 9:10 AM, Halil Pasic wrote: > On Tue, 13 Dec 2022 10:44:37 -0500 > Tony Krowiak <akrowiak@linux.ibm.com> wrote: > >> Clean up IRQ resources even when a PQAP(ZAPQ) function fails with an error >> not handled by a case statement. > Why? If the ZAPQ failed, then instructions submitted to the same queue will likewise fail. Are you saying it's not safe to assume, therefore, that interrupts will not be occurring? > > I'm afraid this is a step in the wrong direction... Please explain why. > >> Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com> >> --- >> drivers/s390/crypto/vfio_ap_ops.c | 2 +- >> 1 file changed, 1 insertion(+), 1 deletion(-) >> >> diff --git a/drivers/s390/crypto/vfio_ap_ops.c b/drivers/s390/crypto/vfio_ap_ops.c >> index e80c5a6b91be..2dd8db9ddb39 100644 >> --- a/drivers/s390/crypto/vfio_ap_ops.c >> +++ b/drivers/s390/crypto/vfio_ap_ops.c >> @@ -1676,7 +1676,7 @@ static int vfio_ap_mdev_reset_queue(struct vfio_ap_queue *q) >> "PQAP/ZAPQ for %02x.%04x failed with invalid rc=%u\n", >> AP_QID_CARD(q->apqn), AP_QID_QUEUE(q->apqn), >> status.response_code); >> - return -EIO; >> + break; >> } >> >> vfio_ap_free_aqic_resources(q);
On Tue, 20 Dec 2022 09:33:03 -0500 Anthony Krowiak <akrowiak@linux.ibm.com> wrote: > On 12/19/22 9:10 AM, Halil Pasic wrote: > > On Tue, 13 Dec 2022 10:44:37 -0500 > > Tony Krowiak <akrowiak@linux.ibm.com> wrote: > > > >> Clean up IRQ resources even when a PQAP(ZAPQ) function fails with an error > >> not handled by a case statement. > > Why? > > > If the ZAPQ failed, then instructions submitted to the same queue will > likewise fail. Are you saying it's not safe to assume, therefore, that > interrupts will not be occurring? Right. We are talking about the default branch here, and I suppose, the codes where we know that it is safe to assume that no reset is needed handled separately (AP_RESPONSE_DECONFIGURED). I'm not convinced that if we take the default branch we can safely assume, that we won't see any interrupts. For example consider hot-unplug as done by KVM. We modify the CRYCB/APCB with all vCPUS take out of SIE, but we don't keep the vCPUs out of SIE until the resets of the unpugged queues are done, and we don't do any extra interrupt disablement with all vCPUs keept out of SIE. So I believe currently there may be a window where the guest can observe a 01 but the interrupts are still live. That may be a bug, but IMHO it ain't clear cut. But it is not just about interrupts. Before we returned an error code, which gets propagated to the userspace if this reset was triggered via the ioctl. With this change, ret seems to be uninitialized when returned if we take the code path which you change here. So we would end up logging a warning and returning garbage? One could also debate, whether RCs introduced down the road can affect the logic here (even if the statement "if we see an RC other that 00 and 02, we don't need to pursue a reset any further, and interrpts are disabled" were to be guaranteed to be true now, new RCs could theoretically mess this up). > > > > > > I'm afraid this is a step in the wrong direction... > > > Please explain why. > Sorry, I kept this brief because IMHO it is your job to tell us why this needs to be changed. But I gave in, as you see. Regards, Halil
On 12/20/22 12:24 PM, Halil Pasic wrote: > On Tue, 20 Dec 2022 09:33:03 -0500 > Anthony Krowiak <akrowiak@linux.ibm.com> wrote: > >> On 12/19/22 9:10 AM, Halil Pasic wrote: >>> On Tue, 13 Dec 2022 10:44:37 -0500 >>> Tony Krowiak <akrowiak@linux.ibm.com> wrote: >>> >>>> Clean up IRQ resources even when a PQAP(ZAPQ) function fails with an error >>>> not handled by a case statement. >>> Why? >> >> If the ZAPQ failed, then instructions submitted to the same queue will >> likewise fail. Are you saying it's not safe to assume, therefore, that >> interrupts will not be occurring? > Right. We are talking about the default branch here, and I suppose, the > codes where we know that it is safe to assume that no reset is needed > handled separately (AP_RESPONSE_DECONFIGURED). > > I'm not convinced that if we take the default branch we can safely > assume, that we won't see any interrupts. > > For example consider hot-unplug as done by KVM. We modify the > CRYCB/APCB with all vCPUS take out of SIE, but we don't keep > the vCPUs out of SIE until the resets of the unpugged queues > are done, and we don't do any extra interrupt disablement > with all vCPUs keept out of SIE. So I believe currently there > may be a window where the guest can observe a 01 but the > interrupts are still live. That may be a bug, but IMHO it ain't clear > cut. > > But it is not just about interrupts. Before we returned an error > code, which gets propagated to the userspace if this reset was > triggered via the ioctl. > > With this change, ret seems to be uninitialized when returned > if we take the code path which you change here. So we would > end up logging a warning and returning garbage? That was an oversight. The -EIO value was returned previously, so the ret = -EIO should be set in the default case. > > One could also debate, whether RCs introduced down the road > can affect the logic here (even if the statement "if we > see an RC other that 00 and 02, we don't need to pursue a > reset any further, and interrpts are disabled" were to be > guaranteed to be true now, new RCs could theoretically mess > this up). I think that would be the case regardless of this change. If new RCs are introduced, this function ought to be revisited anyway and appropriate changes made. > > >> >>> I'm afraid this is a step in the wrong direction... >> >> Please explain why. >> > Sorry, I kept this brief because IMHO it is your job to tell us why > this needs to be changed. But I gave in, as you see. > > Regards, > Halil
diff --git a/drivers/s390/crypto/vfio_ap_ops.c b/drivers/s390/crypto/vfio_ap_ops.c index e80c5a6b91be..2dd8db9ddb39 100644 --- a/drivers/s390/crypto/vfio_ap_ops.c +++ b/drivers/s390/crypto/vfio_ap_ops.c @@ -1676,7 +1676,7 @@ static int vfio_ap_mdev_reset_queue(struct vfio_ap_queue *q) "PQAP/ZAPQ for %02x.%04x failed with invalid rc=%u\n", AP_QID_CARD(q->apqn), AP_QID_QUEUE(q->apqn), status.response_code); - return -EIO; + break; } vfio_ap_free_aqic_resources(q);
Clean up IRQ resources even when a PQAP(ZAPQ) function fails with an error not handled by a case statement. Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com> --- drivers/s390/crypto/vfio_ap_ops.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)