Message ID | 1527493766-15487-1-git-send-email-shawn.lin@rock-chips.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Mon, May 28, 2018 at 03:49:26PM +0800, Shawn Lin wrote: > card->erased_byte read from SCR(for SD cards) or EXT_CSD[181](for eMMC) > indicates whether the TRIM or ERASE make the erased data content zeros, > but DISCARD doesn't. Use the fact to implement REQ_OP_WRITE_ZEROES. I don't have the mmc spec in front of my, but the ATA/SCSI and NVMe equivalents all have the problem that if the device decideds to not deallocate the blocks it can leave the old data in place. Can you confirm that MMC gurantees that this is not the case? -- To unsubscribe from this list: send the line "unsubscribe linux-mmc" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 2018/5/28 15:57, Christoph Hellwig wrote: > On Mon, May 28, 2018 at 03:49:26PM +0800, Shawn Lin wrote: >> card->erased_byte read from SCR(for SD cards) or EXT_CSD[181](for eMMC) >> indicates whether the TRIM or ERASE make the erased data content zeros, >> but DISCARD doesn't. Use the fact to implement REQ_OP_WRITE_ZEROES. > > I don't have the mmc spec in front of my, but the ATA/SCSI and NVMe > equivalents all have the problem that if the device decideds to not > deallocate the blocks it can leave the old data in place. Can you > confirm that MMC gurantees that this is not the case? Yes. Quoting from JESD85-B51 spec for eMMC, section 6.6.10 TRIM, "The contents of a write block where the trim function has been applied shall be '0' or '1' depending on different memory technology. This value is defined in the EXT_CSD.". But 6.6.12 Discard says "The contents of a write block where the discard function has been applied shall be ‘don’t care’After discard operation, the original data may be remained partially or fully accessible to the host dependent on device. The portions of data that are no longer accessible by the host may be removed or unmapped just as in the case of TRIM. The device will decide the contents of discarded write block." Apart from the mandatory requirement in spec, I confirmed this with several device vendors as well, in the past. > > > -- To unsubscribe from this list: send the line "unsubscribe linux-mmc" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, May 28, 2018 at 04:05:38PM +0800, Shawn Lin wrote: > On 2018/5/28 15:57, Christoph Hellwig wrote: >> On Mon, May 28, 2018 at 03:49:26PM +0800, Shawn Lin wrote: >>> card->erased_byte read from SCR(for SD cards) or EXT_CSD[181](for eMMC) >>> indicates whether the TRIM or ERASE make the erased data content zeros, >>> but DISCARD doesn't. Use the fact to implement REQ_OP_WRITE_ZEROES. >> >> I don't have the mmc spec in front of my, but the ATA/SCSI and NVMe >> equivalents all have the problem that if the device decideds to not >> deallocate the blocks it can leave the old data in place. Can you >> confirm that MMC gurantees that this is not the case? > > Yes. Quoting from JESD85-B51 spec for eMMC, section 6.6.10 TRIM, > "The contents of a write block where the trim function has been applied > shall be '0' or '1' depending on different memory technology. This value > is defined in the EXT_CSD.". But 6.6.12 Discard says "The > contents of a write block where the discard function has been applied > shall be ‘don’t care’After discard operation, the original data may > be remained partially or fully accessible to the host dependent on > device. The portions of data that are no longer accessible by the host > may be removed or unmapped just as in the case of TRIM. The device will > decide the contents of discarded write block." > > Apart from the mandatory requirement in spec, I confirmed this with > several device vendors as well, in the past. So don't we need to check that the device actually supports the TRIM operation (that is call mmc_can_trim()) before setting max_write_zeroes_sectors? Also you probably need a function different from mmc_calc_max_discard to calculate max_write_zeroes_sectors given that the different methods seem to have different limits. -- To unsubscribe from this list: send the line "unsubscribe linux-mmc" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 2018/5/28 16:51, Christoph Hellwig wrote: > On Mon, May 28, 2018 at 04:05:38PM +0800, Shawn Lin wrote: >> On 2018/5/28 15:57, Christoph Hellwig wrote: >>> On Mon, May 28, 2018 at 03:49:26PM +0800, Shawn Lin wrote: >>>> card->erased_byte read from SCR(for SD cards) or EXT_CSD[181](for eMMC) >>>> indicates whether the TRIM or ERASE make the erased data content zeros, >>>> but DISCARD doesn't. Use the fact to implement REQ_OP_WRITE_ZEROES. >>> >>> I don't have the mmc spec in front of my, but the ATA/SCSI and NVMe >>> equivalents all have the problem that if the device decideds to not >>> deallocate the blocks it can leave the old data in place. Can you >>> confirm that MMC gurantees that this is not the case? >> >> Yes. Quoting from JESD85-B51 spec for eMMC, section 6.6.10 TRIM, >> "The contents of a write block where the trim function has been applied >> shall be '0' or '1' depending on different memory technology. This value >> is defined in the EXT_CSD.". But 6.6.12 Discard says "The >> contents of a write block where the discard function has been applied >> shall be ‘don’t care’After discard operation, the original data may >> be remained partially or fully accessible to the host dependent on >> device. The portions of data that are no longer accessible by the host >> may be removed or unmapped just as in the case of TRIM. The device will >> decide the contents of discarded write block." >> >> Apart from the mandatory requirement in spec, I confirmed this with >> several device vendors as well, in the past. > > So don't we need to check that the device actually supports the TRIM > operation (that is call mmc_can_trim()) before setting mmc_can_erase() could help, as it, ERASE command, also guarantees the data to be zeros after erased. And that checking is done before we set max_write_zeroes_sectors. drivers/mmc/core/queue.c 364 if (mmc_can_erase(card)) 365 mmc_queue_setup_discard(mq->queue, card); > max_write_zeroes_sectors? Also you probably need a function different > from mmc_calc_max_discard to calculate max_write_zeroes_sectors given > that the different methods seem to have different limits. At this point, we needn't. The limitation of ERASE/TRIM/DISCARD for mmc are the same. And mmc_calc_max_discard already do the right thing IMHO. > > > -- To unsubscribe from this list: send the line "unsubscribe linux-mmc" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Hi Shawn, Christoph, On Monday 28 May 2018 01:19 PM, Shawn Lin wrote: > card->erased_byte read from SCR(for SD cards) or EXT_CSD[181](for eMMC) > indicates whether the TRIM or ERASE make the erased data content zeros, > but DISCARD doesn't. Use the fact to implement REQ_OP_WRITE_ZEROES. > > Cc: Faiz Abbas <faiz_abbas@ti.com> > Cc: Christoph Hellwig <hch@lst.de> > Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com> > --- > Hi Faiz, > Would you like to test this patch to see if it could > solve your performance drop on the first write after > filesystem creation? > This patch doesn't fix the issue for me. I tested it on the latest master. I should get to further analyzing this by early next week. Log (notice first throughput lower than others): https://pastebin.ubuntu.com/p/NmZqzwfNjz/ Thanks, Faiz -- To unsubscribe from this list: send the line "unsubscribe linux-mmc" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/drivers/mmc/core/block.c b/drivers/mmc/core/block.c index a0b9102..7d59887 100644 --- a/drivers/mmc/core/block.c +++ b/drivers/mmc/core/block.c @@ -1132,7 +1132,14 @@ static void mmc_blk_issue_discard_rq(struct mmc_queue *mq, struct request *req) from = blk_rq_pos(req); nr = blk_rq_sectors(req); - if (mmc_can_discard(card)) + /* + * DISCARD says the contents of a write block where the discard + * function has been applied shold be 'don't care'. But TRIM and + * ERASE should have explicit '1' or '0' indicated by + * card->erased_byte. So REQ_OP_WRITE_ZEROES should use TRIM/ERASE + * instead. + */ + if (mmc_can_discard(card) && req_op(req) != REQ_OP_WRITE_ZEROES) arg = MMC_DISCARD_ARG; else if (mmc_can_trim(card)) arg = MMC_TRIM_ARG; @@ -2228,6 +2235,7 @@ enum mmc_issued mmc_blk_mq_issue_rq(struct mmc_queue *mq, struct request *req) mmc_blk_issue_drv_op(mq, req); break; case REQ_OP_DISCARD: + case REQ_OP_WRITE_ZEROES: mmc_blk_issue_discard_rq(mq, req); break; case REQ_OP_SECURE_ERASE: diff --git a/drivers/mmc/core/queue.c b/drivers/mmc/core/queue.c index 56e9a80..2657a46 100644 --- a/drivers/mmc/core/queue.c +++ b/drivers/mmc/core/queue.c @@ -51,6 +51,7 @@ static enum mmc_issue_type mmc_cqe_issue_type(struct mmc_host *host, case REQ_OP_DRV_OUT: case REQ_OP_DISCARD: case REQ_OP_SECURE_ERASE: + case REQ_OP_WRITE_ZEROES: return MMC_ISSUE_SYNC; case REQ_OP_FLUSH: return mmc_cqe_can_dcmd(host) ? MMC_ISSUE_DCMD : MMC_ISSUE_SYNC; @@ -193,6 +194,8 @@ static void mmc_queue_setup_discard(struct request_queue *q, q->limits.discard_granularity = 0; if (mmc_can_secure_erase_trim(card)) blk_queue_flag_set(QUEUE_FLAG_SECERASE, q); + if (!card->erased_byte) + q->limits.max_write_zeroes_sectors = max_discard; } /**
card->erased_byte read from SCR(for SD cards) or EXT_CSD[181](for eMMC) indicates whether the TRIM or ERASE make the erased data content zeros, but DISCARD doesn't. Use the fact to implement REQ_OP_WRITE_ZEROES. Cc: Faiz Abbas <faiz_abbas@ti.com> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com> --- Hi Faiz, Would you like to test this patch to see if it could solve your performance drop on the first write after filesystem creation? drivers/mmc/core/block.c | 10 +++++++++- drivers/mmc/core/queue.c | 3 +++ 2 files changed, 12 insertions(+), 1 deletion(-)