Message ID | 1511196664-85304-3-git-send-email-anton.nefedov@virtuozzo.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Mon 20 Nov 2017 05:50:59 PM CET, Anton Nefedov wrote: > Signed-off-by: Anton Nefedov <anton.nefedov@virtuozzo.com> > Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> > --- > hw/ide/core.c | 12 ++++++++++++ > 1 file changed, 12 insertions(+) > > diff --git a/hw/ide/core.c b/hw/ide/core.c > index 471d0c9..2e4dea7 100644 > --- a/hw/ide/core.c > +++ b/hw/ide/core.c > @@ -389,6 +389,7 @@ typedef struct TrimAIOCB { > QEMUIOVector *qiov; > BlockAIOCB *aiocb; > int i, j; > + BlockAcctCookie acct; > } TrimAIOCB; > > static void trim_aio_cancel(BlockAIOCB *acb) > @@ -426,6 +427,14 @@ static void ide_trim_bh_cb(void *opaque) > static void ide_issue_trim_cb(void *opaque, int ret) > { > TrimAIOCB *iocb = opaque; > + if (iocb->i >= 0) { > + if (ret >= 0) { > + block_acct_done(blk_get_stats(iocb->blk), &iocb->acct); > + } else { > + block_acct_failed(blk_get_stats(iocb->blk), &iocb->acct); > + } > + } This part looks fine, but don't you also need to account for invalid requests (in ide_dma_cb() or somewhere else) ? Berto
On 5/12/2017 6:21 PM, Alberto Garcia wrote: > On Mon 20 Nov 2017 05:50:59 PM CET, Anton Nefedov wrote: >> Signed-off-by: Anton Nefedov <anton.nefedov@virtuozzo.com> >> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> >> --- >> hw/ide/core.c | 12 ++++++++++++ >> 1 file changed, 12 insertions(+) >> >> diff --git a/hw/ide/core.c b/hw/ide/core.c >> index 471d0c9..2e4dea7 100644 >> --- a/hw/ide/core.c >> +++ b/hw/ide/core.c >> @@ -389,6 +389,7 @@ typedef struct TrimAIOCB { >> QEMUIOVector *qiov; >> BlockAIOCB *aiocb; >> int i, j; >> + BlockAcctCookie acct; >> } TrimAIOCB; >> >> static void trim_aio_cancel(BlockAIOCB *acb) >> @@ -426,6 +427,14 @@ static void ide_trim_bh_cb(void *opaque) >> static void ide_issue_trim_cb(void *opaque, int ret) >> { >> TrimAIOCB *iocb = opaque; >> + if (iocb->i >= 0) { >> + if (ret >= 0) { >> + block_acct_done(blk_get_stats(iocb->blk), &iocb->acct); >> + } else { >> + block_acct_failed(blk_get_stats(iocb->blk), &iocb->acct); >> + } >> + } > > This part looks fine, but don't you also need to account for invalid > requests (in ide_dma_cb() or somewhere else) ? > > Berto > Good point; in fact, the TRIM sector range is never checked. (well it should be, down at the block layer, and then counted as error). The motivation was: commit d66168ed687325aa6d338ce3a3cff18ce3098ed6 Author: Michael Tokarev <mjt@tls.msk.ru> Date: Wed Aug 13 11:23:31 2014 +0400 ide: only constrain read/write requests to drive size, not other types Commit 58ac321135a introduced a check to ide dma processing which constrains all requests to drive size. However, apparently, some valid requests (like TRIM) does not fit in this constraint, and fails in 2.1. So check the range only for reads and writes. It seems like the removed check was at the wrong place (trim request has to be parsed first to get offset and nbytes). Probably it should be put to ide_issue_trim_cb() instead. cc John (should have done it earlier) /Anton
On 12/05/2017 12:14 PM, Anton Nefedov wrote: > > > On 5/12/2017 6:21 PM, Alberto Garcia wrote: >> On Mon 20 Nov 2017 05:50:59 PM CET, Anton Nefedov wrote: >>> Signed-off-by: Anton Nefedov <anton.nefedov@virtuozzo.com> >>> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> >>> --- >>> hw/ide/core.c | 12 ++++++++++++ >>> 1 file changed, 12 insertions(+) >>> >>> diff --git a/hw/ide/core.c b/hw/ide/core.c >>> index 471d0c9..2e4dea7 100644 >>> --- a/hw/ide/core.c >>> +++ b/hw/ide/core.c >>> @@ -389,6 +389,7 @@ typedef struct TrimAIOCB { >>> QEMUIOVector *qiov; >>> BlockAIOCB *aiocb; >>> int i, j; >>> + BlockAcctCookie acct; >>> } TrimAIOCB; >>> static void trim_aio_cancel(BlockAIOCB *acb) >>> @@ -426,6 +427,14 @@ static void ide_trim_bh_cb(void *opaque) >>> static void ide_issue_trim_cb(void *opaque, int ret) >>> { >>> TrimAIOCB *iocb = opaque; >>> + if (iocb->i >= 0) { >>> + if (ret >= 0) { >>> + block_acct_done(blk_get_stats(iocb->blk), &iocb->acct); >>> + } else { >>> + block_acct_failed(blk_get_stats(iocb->blk), &iocb->acct); >>> + } >>> + } >> >> This part looks fine, but don't you also need to account for invalid >> requests (in ide_dma_cb() or somewhere else) ? >> not ide_dma_cb this time, because the command does not use ATA registers as input, see below :( >> Berto >> > > Good point; in fact, the TRIM sector range is never checked. > (well it should be, down at the block layer, and then counted as error). > > The motivation was: > > commit d66168ed687325aa6d338ce3a3cff18ce3098ed6 > Author: Michael Tokarev <mjt@tls.msk.ru> > Date: Wed Aug 13 11:23:31 2014 +0400 > > ide: only constrain read/write requests to drive size, not other types > > Commit 58ac321135a introduced a check to ide dma processing which > constrains all requests to drive size. However, apparently, some > valid requests (like TRIM) does not fit in this constraint, and > fails in 2.1. So check the range only for reads and writes. > I wound up at the same commit. The problem here is that the TRIM command does not issue contiguous LBA+count requests in the same way using the ATA registers like DMA R/W functions do, but instead works a bit more like DMA commands in that it transmits a list of regions separately. Kevin pointed this out in 2014: https://lists.gnu.org/archive/html/qemu-devel/2014-08/msg02012.html "I can't give you a clear answer, but it all depends on the value of sector_num. This is the contents of the LBA registers and unused for TRIM (spec says it's reserved, so we shouldn't be looking at it)." He's right! This means the LBA/Count registers are undefined when we're using TRIM in ide_dma_cb. So, you have two things you can do: (1) Modify ide_sector_start_dma to start accounting for TRIM early, and then continue to mark it failed/complete where you do, but you'll need to add in a new error case in ide_issue_trim_cb to check the range and abort the job. We do not need to preprocess the entire "list" of TRIM regions upfront as we are within our rights to process some of them before aborting. (2) Continue to start accounting where you do, per-region, which keeps trim requests "per chunk", but add the error range checking almost immediately after, if this is useful for statistical purposes. It will look a little funny to start accounting and then immediately and synchronously mark it invalid, but that's probably the most accurate thing. I'm basing this off of the ATA8-AC3 spec, which defines the command "DATA SET MANAGEMENT - 06h, DMA": > If the Trim bit is set to one and: > a) the device detects an invalid LBA Range Entry; or > b) count is greater than IDENTIFY DEVICE data word 105 (see 7.16.7.55), > then the device shall return command aborted. > A device may trim one or more LBA Range Entries before it returns command aborted. See table 209. ATA8-ACS3 section seems pretty clear to me that we *may* abort the command if we detect an "invalid LBA Range Entry" which is defined in 3.1.40 as: "3.1.40 Invalid LBA Range: A range of LBAs that contains one or more invalid LBAs." which references 3.1.39: 3.1.39 Invalid LBA: An LBA that is greater than or equal to the largest value reported in IDENTIFY DEVICE data words 60..61 (see 7.16.7.22), IDENTIFY DEVICE data words 100..103 (see 7.16.7.53), or IDENTIFY DEVICE data words 230..233 (see 7.16.7.88). Of course, we only know if a range could possibly be invalid by the time we actually process it in ide_issue_trim_cb. > > It seems like the removed check was at the wrong place (trim request has > to be parsed first to get offset and nbytes). > Probably it should be put to ide_issue_trim_cb() instead. > > cc John (should have done it earlier) > Hi! > /Anton >
diff --git a/hw/ide/core.c b/hw/ide/core.c index 471d0c9..2e4dea7 100644 --- a/hw/ide/core.c +++ b/hw/ide/core.c @@ -389,6 +389,7 @@ typedef struct TrimAIOCB { QEMUIOVector *qiov; BlockAIOCB *aiocb; int i, j; + BlockAcctCookie acct; } TrimAIOCB; static void trim_aio_cancel(BlockAIOCB *acb) @@ -426,6 +427,14 @@ static void ide_trim_bh_cb(void *opaque) static void ide_issue_trim_cb(void *opaque, int ret) { TrimAIOCB *iocb = opaque; + if (iocb->i >= 0) { + if (ret >= 0) { + block_acct_done(blk_get_stats(iocb->blk), &iocb->acct); + } else { + block_acct_failed(blk_get_stats(iocb->blk), &iocb->acct); + } + } + if (ret >= 0) { while (iocb->j < iocb->qiov->niov) { int j = iocb->j; @@ -442,6 +451,9 @@ static void ide_issue_trim_cb(void *opaque, int ret) continue; } + block_acct_start(blk_get_stats(iocb->blk), &iocb->acct, + count << BDRV_SECTOR_BITS, BLOCK_ACCT_UNMAP); + /* Got an entry! Submit and exit. */ iocb->aiocb = blk_aio_pdiscard(iocb->blk, sector << BDRV_SECTOR_BITS,