Message ID | 20220120142259.120189-1-hreitz@redhat.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [v2] ide: Increment BB in-flight counter for TRIM BH | expand |
On Thu, Jan 20, 2022 at 9:23 AM Hanna Reitz <hreitz@redhat.com> wrote: > > When we still have an AIOCB registered for DMA operations, we try to > settle the respective operation by draining the BlockBackend associated > with the IDE device. > > However, this assumes that every DMA operation is associated with an > increment of the BlockBackend’s in-flight counter (e.g. through some > ongoing I/O operation), so that draining the BB until its in-flight > counter reaches 0 will settle all DMA operations. That is not the case: > For TRIM, the guest can issue a zero-length operation that will not > result in any I/O operation forwarded to the BlockBackend, and also not > increment the in-flight counter in any other way. In such a case, > blk_drain() will be a no-op if no other operations are in flight. > > It is clear that if blk_drain() is a no-op, the value of > s->bus->dma->aiocb will not change between checking it in the `if` > condition and asserting that it is NULL after blk_drain(). > > The particular problem is that ide_issue_trim() creates a BH > (ide_trim_bh_cb()) to settle the TRIM request: iocb->common.cb() is > ide_dma_cb(), which will either create a new request, or find the > transfer to be done and call ide_set_inactive(), which clears > s->bus->dma->aiocb. Therefore, the blk_drain() must wait for > ide_trim_bh_cb() to run, which currently it will not always do. > > To fix this issue, we increment the BlockBackend's in-flight counter > when the TRIM operation begins (in ide_issue_trim(), when the > ide_trim_bh_cb() BH is created) and decrement it when ide_trim_bh_cb() > is done. > > Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=2029980 > Suggested-by: Paolo Bonzini <pbonzini@redhat.com> > Signed-off-by: Hanna Reitz <hreitz@redhat.com> > --- > v1: > https://lists.nongnu.org/archive/html/qemu-block/2022-01/msg00024.html > > v2: > - Increment BB’s in-flight counter while the BH is active so that > blk_drain() will poll until the BH is done, as suggested by Paolo > > (No git-backport-diff, because this patch was basically completely > rewritten, so it wouldn’t be worth it.) > --- > hw/ide/core.c | 7 +++++++ > 1 file changed, 7 insertions(+) > > diff --git a/hw/ide/core.c b/hw/ide/core.c > index e28f8aad61..15138225be 100644 > --- a/hw/ide/core.c > +++ b/hw/ide/core.c > @@ -433,12 +433,16 @@ static const AIOCBInfo trim_aiocb_info = { > static void ide_trim_bh_cb(void *opaque) > { > TrimAIOCB *iocb = opaque; > + BlockBackend *blk = iocb->s->blk; > > iocb->common.cb(iocb->common.opaque, iocb->ret); > > qemu_bh_delete(iocb->bh); > iocb->bh = NULL; > qemu_aio_unref(iocb); > + > + /* Paired with an increment in ide_issue_trim() */ > + blk_dec_in_flight(blk); > } > > static void ide_issue_trim_cb(void *opaque, int ret) > @@ -508,6 +512,9 @@ BlockAIOCB *ide_issue_trim( > IDEState *s = opaque; > TrimAIOCB *iocb; > > + /* Paired with a decrement in ide_trim_bh_cb() */ > + blk_inc_in_flight(s->blk); > + > iocb = blk_aio_get(&trim_aiocb_info, s->blk, cb, cb_opaque); > iocb->s = s; > iocb->bh = qemu_bh_new(ide_trim_bh_cb, iocb); > -- > 2.34.1 > Oh, this *wasn't* the same bug I thought it was. There's no regression test, but I will trust you (and Paolo) that this solves the bug you were seeing. It makes sense. Reviewed-by: John Snow <jsnow@redhat.com> Tested-by: John Snow <jsnow@redhat.com>
On 21.01.22 19:47, John Snow wrote: > On Thu, Jan 20, 2022 at 9:23 AM Hanna Reitz <hreitz@redhat.com> wrote: >> When we still have an AIOCB registered for DMA operations, we try to >> settle the respective operation by draining the BlockBackend associated >> with the IDE device. >> >> However, this assumes that every DMA operation is associated with an >> increment of the BlockBackend’s in-flight counter (e.g. through some >> ongoing I/O operation), so that draining the BB until its in-flight >> counter reaches 0 will settle all DMA operations. That is not the case: >> For TRIM, the guest can issue a zero-length operation that will not >> result in any I/O operation forwarded to the BlockBackend, and also not >> increment the in-flight counter in any other way. In such a case, >> blk_drain() will be a no-op if no other operations are in flight. >> >> It is clear that if blk_drain() is a no-op, the value of >> s->bus->dma->aiocb will not change between checking it in the `if` >> condition and asserting that it is NULL after blk_drain(). >> >> The particular problem is that ide_issue_trim() creates a BH >> (ide_trim_bh_cb()) to settle the TRIM request: iocb->common.cb() is >> ide_dma_cb(), which will either create a new request, or find the >> transfer to be done and call ide_set_inactive(), which clears >> s->bus->dma->aiocb. Therefore, the blk_drain() must wait for >> ide_trim_bh_cb() to run, which currently it will not always do. >> >> To fix this issue, we increment the BlockBackend's in-flight counter >> when the TRIM operation begins (in ide_issue_trim(), when the >> ide_trim_bh_cb() BH is created) and decrement it when ide_trim_bh_cb() >> is done. >> >> Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=2029980 >> Suggested-by: Paolo Bonzini <pbonzini@redhat.com> >> Signed-off-by: Hanna Reitz <hreitz@redhat.com> >> --- >> v1: >> https://lists.nongnu.org/archive/html/qemu-block/2022-01/msg00024.html >> >> v2: >> - Increment BB’s in-flight counter while the BH is active so that >> blk_drain() will poll until the BH is done, as suggested by Paolo >> >> (No git-backport-diff, because this patch was basically completely >> rewritten, so it wouldn’t be worth it.) >> --- >> hw/ide/core.c | 7 +++++++ >> 1 file changed, 7 insertions(+) >> >> diff --git a/hw/ide/core.c b/hw/ide/core.c >> index e28f8aad61..15138225be 100644 >> --- a/hw/ide/core.c >> +++ b/hw/ide/core.c >> @@ -433,12 +433,16 @@ static const AIOCBInfo trim_aiocb_info = { >> static void ide_trim_bh_cb(void *opaque) >> { >> TrimAIOCB *iocb = opaque; >> + BlockBackend *blk = iocb->s->blk; >> >> iocb->common.cb(iocb->common.opaque, iocb->ret); >> >> qemu_bh_delete(iocb->bh); >> iocb->bh = NULL; >> qemu_aio_unref(iocb); >> + >> + /* Paired with an increment in ide_issue_trim() */ >> + blk_dec_in_flight(blk); >> } >> >> static void ide_issue_trim_cb(void *opaque, int ret) >> @@ -508,6 +512,9 @@ BlockAIOCB *ide_issue_trim( >> IDEState *s = opaque; >> TrimAIOCB *iocb; >> >> + /* Paired with a decrement in ide_trim_bh_cb() */ >> + blk_inc_in_flight(s->blk); >> + >> iocb = blk_aio_get(&trim_aiocb_info, s->blk, cb, cb_opaque); >> iocb->s = s; >> iocb->bh = qemu_bh_new(ide_trim_bh_cb, iocb); >> -- >> 2.34.1 >> > Oh, this *wasn't* the same bug I thought it was. > > There's no regression test, but I will trust you (and Paolo) that this > solves the bug you were seeing. It makes sense. There is one in the BZ linked, but I don’t know where we’d put it into the qemu tree... I’ve explained in v1 (https://lists.nongnu.org/archive/html/qemu-block/2022-01/msg00024.html) how I didn’t find a way to write a qtest for this, and so resorted to writing boot sector code to reproduce the assertion failure. Now, we could put that as a sample image into the iotests, but that’d just be... wrong. (Is there a place where something like this would belong?) > Reviewed-by: John Snow <jsnow@redhat.com> > Tested-by: John Snow <jsnow@redhat.com> Thanks! Hanna
On 1/20/22 15:22, Hanna Reitz wrote: > > Buglink:https://bugzilla.redhat.com/show_bug.cgi?id=2029980 > Suggested-by: Paolo Bonzini<pbonzini@redhat.com> > Signed-off-by: Hanna Reitz<hreitz@redhat.com> > --- > v1: > https://lists.nongnu.org/archive/html/qemu-block/2022-01/msg00024.html > > v2: > - Increment BB’s in-flight counter while the BH is active so that > blk_drain() will poll until the BH is done, as suggested by Paolo > > (No git-backport-diff, because this patch was basically completely > rewritten, so it wouldn’t be worth it.) FWIW you can put a Supersedes: <20220105111337.10366-1-hreitz@redhat.com> anywhere in the message, and patchew will gladly produce the crossdiff link https://patchew.org/QEMU/20220105111337.10366-1-hreitz@redhat.com/diff/20220120142259.120189-1-hreitz@redhat.com/ for disbelievers---or for those who want to compare the commit messages. Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Paolo
On Mon, Jan 24, 2022 at 4:06 AM Hanna Reitz <hreitz@redhat.com> wrote: > > On 21.01.22 19:47, John Snow wrote: > > > > There's no regression test, but I will trust you (and Paolo) that this > > solves the bug you were seeing. It makes sense. > > There is one in the BZ linked, but I don’t know where we’d put it into > the qemu tree... I’ve explained in v1 > (https://lists.nongnu.org/archive/html/qemu-block/2022-01/msg00024.html) > how I didn’t find a way to write a qtest for this, and so resorted to > writing boot sector code to reproduce the assertion failure. Now, we > could put that as a sample image into the iotests, but that’d just be... > wrong. (Is there a place where something like this would belong?) > No idea. I guess it'd be more of an avacado-test level thing, but I'm not sure I know how to do it quickly. I'm worried there's lots of little things like this that'd be nice to test against, but "where do we put this" is a recurring problem. (Definitely not insisting on this being solved, and also wise enough to not want to volunteer.) > > Reviewed-by: John Snow <jsnow@redhat.com> > > Tested-by: John Snow <jsnow@redhat.com> > > Thanks! > > Hanna > --js
Ping (I can take it too, if you’d like, John, but you’re listed as the only maintainer for hw/ide, so... Just say the word, though!) On 20.01.22 15:22, Hanna Reitz wrote: > When we still have an AIOCB registered for DMA operations, we try to > settle the respective operation by draining the BlockBackend associated > with the IDE device. > > However, this assumes that every DMA operation is associated with an > increment of the BlockBackend’s in-flight counter (e.g. through some > ongoing I/O operation), so that draining the BB until its in-flight > counter reaches 0 will settle all DMA operations. That is not the case: > For TRIM, the guest can issue a zero-length operation that will not > result in any I/O operation forwarded to the BlockBackend, and also not > increment the in-flight counter in any other way. In such a case, > blk_drain() will be a no-op if no other operations are in flight. > > It is clear that if blk_drain() is a no-op, the value of > s->bus->dma->aiocb will not change between checking it in the `if` > condition and asserting that it is NULL after blk_drain(). > > The particular problem is that ide_issue_trim() creates a BH > (ide_trim_bh_cb()) to settle the TRIM request: iocb->common.cb() is > ide_dma_cb(), which will either create a new request, or find the > transfer to be done and call ide_set_inactive(), which clears > s->bus->dma->aiocb. Therefore, the blk_drain() must wait for > ide_trim_bh_cb() to run, which currently it will not always do. > > To fix this issue, we increment the BlockBackend's in-flight counter > when the TRIM operation begins (in ide_issue_trim(), when the > ide_trim_bh_cb() BH is created) and decrement it when ide_trim_bh_cb() > is done. > > Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=2029980 > Suggested-by: Paolo Bonzini <pbonzini@redhat.com> > Signed-off-by: Hanna Reitz <hreitz@redhat.com> > --- > v1: > https://lists.nongnu.org/archive/html/qemu-block/2022-01/msg00024.html > > v2: > - Increment BB’s in-flight counter while the BH is active so that > blk_drain() will poll until the BH is done, as suggested by Paolo > > (No git-backport-diff, because this patch was basically completely > rewritten, so it wouldn’t be worth it.) > --- > hw/ide/core.c | 7 +++++++ > 1 file changed, 7 insertions(+)
On Tue, Feb 15, 2022 at 12:14 PM Hanna Reitz <hreitz@redhat.com> wrote: > > Ping > > (I can take it too, if you’d like, John, but you’re listed as the only > maintainer for hw/ide, so... Just say the word, though!) > Sorry, I sent you a mail off-list at the time where I said you were free to take it whenever you like. Why'd I send it off-list? I don't know.... Please feel free to send this with your next block PR. --js > On 20.01.22 15:22, Hanna Reitz wrote: > > When we still have an AIOCB registered for DMA operations, we try to > > settle the respective operation by draining the BlockBackend associated > > with the IDE device. > > > > However, this assumes that every DMA operation is associated with an > > increment of the BlockBackend’s in-flight counter (e.g. through some > > ongoing I/O operation), so that draining the BB until its in-flight > > counter reaches 0 will settle all DMA operations. That is not the case: > > For TRIM, the guest can issue a zero-length operation that will not > > result in any I/O operation forwarded to the BlockBackend, and also not > > increment the in-flight counter in any other way. In such a case, > > blk_drain() will be a no-op if no other operations are in flight. > > > > It is clear that if blk_drain() is a no-op, the value of > > s->bus->dma->aiocb will not change between checking it in the `if` > > condition and asserting that it is NULL after blk_drain(). > > > > The particular problem is that ide_issue_trim() creates a BH > > (ide_trim_bh_cb()) to settle the TRIM request: iocb->common.cb() is > > ide_dma_cb(), which will either create a new request, or find the > > transfer to be done and call ide_set_inactive(), which clears > > s->bus->dma->aiocb. Therefore, the blk_drain() must wait for > > ide_trim_bh_cb() to run, which currently it will not always do. > > > > To fix this issue, we increment the BlockBackend's in-flight counter > > when the TRIM operation begins (in ide_issue_trim(), when the > > ide_trim_bh_cb() BH is created) and decrement it when ide_trim_bh_cb() > > is done. > > > > Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=2029980 > > Suggested-by: Paolo Bonzini <pbonzini@redhat.com> > > Signed-off-by: Hanna Reitz <hreitz@redhat.com> > > --- > > v1: > > https://lists.nongnu.org/archive/html/qemu-block/2022-01/msg00024.html > > > > v2: > > - Increment BB’s in-flight counter while the BH is active so that > > blk_drain() will poll until the BH is done, as suggested by Paolo > > > > (No git-backport-diff, because this patch was basically completely > > rewritten, so it wouldn’t be worth it.) > > --- > > hw/ide/core.c | 7 +++++++ > > 1 file changed, 7 insertions(+) >
On 17.02.22 22:02, John Snow wrote: > On Tue, Feb 15, 2022 at 12:14 PM Hanna Reitz <hreitz@redhat.com> wrote: >> Ping >> >> (I can take it too, if you’d like, John, but you’re listed as the only >> maintainer for hw/ide, so... Just say the word, though!) >> > Sorry, I sent you a mail off-list at the time where I said you were > free to take it whenever you like. Why'd I send it off-list? I don't > know.... > > Please feel free to send this with your next block PR. Oh, oops. You’re right. I seemed to remember you saying something along those lines, but when I tried to support my memory with some factual email, I didn’t immediately see it, because I sorted that off-list mail somewhere else than where I was looking... Thanks, will do! Hanna
diff --git a/hw/ide/core.c b/hw/ide/core.c index e28f8aad61..15138225be 100644 --- a/hw/ide/core.c +++ b/hw/ide/core.c @@ -433,12 +433,16 @@ static const AIOCBInfo trim_aiocb_info = { static void ide_trim_bh_cb(void *opaque) { TrimAIOCB *iocb = opaque; + BlockBackend *blk = iocb->s->blk; iocb->common.cb(iocb->common.opaque, iocb->ret); qemu_bh_delete(iocb->bh); iocb->bh = NULL; qemu_aio_unref(iocb); + + /* Paired with an increment in ide_issue_trim() */ + blk_dec_in_flight(blk); } static void ide_issue_trim_cb(void *opaque, int ret) @@ -508,6 +512,9 @@ BlockAIOCB *ide_issue_trim( IDEState *s = opaque; TrimAIOCB *iocb; + /* Paired with a decrement in ide_trim_bh_cb() */ + blk_inc_in_flight(s->blk); + iocb = blk_aio_get(&trim_aiocb_info, s->blk, cb, cb_opaque); iocb->s = s; iocb->bh = qemu_bh_new(ide_trim_bh_cb, iocb);
When we still have an AIOCB registered for DMA operations, we try to settle the respective operation by draining the BlockBackend associated with the IDE device. However, this assumes that every DMA operation is associated with an increment of the BlockBackend’s in-flight counter (e.g. through some ongoing I/O operation), so that draining the BB until its in-flight counter reaches 0 will settle all DMA operations. That is not the case: For TRIM, the guest can issue a zero-length operation that will not result in any I/O operation forwarded to the BlockBackend, and also not increment the in-flight counter in any other way. In such a case, blk_drain() will be a no-op if no other operations are in flight. It is clear that if blk_drain() is a no-op, the value of s->bus->dma->aiocb will not change between checking it in the `if` condition and asserting that it is NULL after blk_drain(). The particular problem is that ide_issue_trim() creates a BH (ide_trim_bh_cb()) to settle the TRIM request: iocb->common.cb() is ide_dma_cb(), which will either create a new request, or find the transfer to be done and call ide_set_inactive(), which clears s->bus->dma->aiocb. Therefore, the blk_drain() must wait for ide_trim_bh_cb() to run, which currently it will not always do. To fix this issue, we increment the BlockBackend's in-flight counter when the TRIM operation begins (in ide_issue_trim(), when the ide_trim_bh_cb() BH is created) and decrement it when ide_trim_bh_cb() is done. Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=2029980 Suggested-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Hanna Reitz <hreitz@redhat.com> --- v1: https://lists.nongnu.org/archive/html/qemu-block/2022-01/msg00024.html v2: - Increment BB’s in-flight counter while the BH is active so that blk_drain() will poll until the BH is done, as suggested by Paolo (No git-backport-diff, because this patch was basically completely rewritten, so it wouldn’t be worth it.) --- hw/ide/core.c | 7 +++++++ 1 file changed, 7 insertions(+)