Message ID: 20180105131621.20808-25-m@bjorling.me (mailing list archive)
State: New, archived
On Fri, Jan 05 2018, Matias Bjørling wrote:
> From: Javier González <javier@cnexlabs.com>
>
> Since pblk registers its own block device, the iostat accounting is
> not automatically done for us. Therefore, add the necessary
> accounting logic to satisfy the iostat interface.

Ignorant question - why is it a raw block device, not using blk-mq?

> @@ -193,9 +197,9 @@ static void pblk_end_io_read(struct nvm_rq *rqd)
>  	__pblk_end_io_read(pblk, rqd, true);
>  }
>
> -static int pblk_fill_partial_read_bio(struct pblk *pblk, struct nvm_rq *rqd,
> -				      unsigned int bio_init_idx,
> -				      unsigned long *read_bitmap)
> +static int pblk_partial_read_bio(struct pblk *pblk, struct nvm_rq *rqd,
> +				 unsigned int bio_init_idx,
> +				 unsigned long *read_bitmap)
>  {
>  	struct bio *new_bio, *bio = rqd->bio;
>  	struct pblk_sec_meta *meta_list = rqd->meta_list;
> @@ -306,6 +310,8 @@ static int pblk_fill_partial_read_bio(struct pblk *pblk, struct nvm_rq *rqd,
>  	return NVM_IO_OK;
>
>  err:
> +	pr_err("pblk: failed to perform partial read\n");
> +
>  	/* Free allocated pages in new bio */
>  	pblk_bio_free_pages(pblk, bio, 0, new_bio->bi_vcnt);
>  	__pblk_end_io_read(pblk, rqd, false);

This seems to include unrelated changes, like the rename above and the
addition of the error logging?
On 01/05/2018 04:42 PM, Jens Axboe wrote:
> On Fri, Jan 05 2018, Matias Bjørling wrote:
>> From: Javier González <javier@cnexlabs.com>
>>
>> Since pblk registers its own block device, the iostat accounting is
>> not automatically done for us. Therefore, add the necessary
>> accounting logic to satisfy the iostat interface.
>
> Ignorant question - why is it a raw block device, not using blk-mq?

The current flow is using the raw block device, together with the blk-mq
nvme device driver. A bio is sent down to the nvme_nvm_submit_io() path
in the /drivers/nvme/host/lightnvm.c file. From there it attaches to the
NVMe blk-mq implementation.

Is there a better way to do it?

>> @@ -193,9 +197,9 @@ static void pblk_end_io_read(struct nvm_rq *rqd)
>>  	__pblk_end_io_read(pblk, rqd, true);
>>  }
>>
>> -static int pblk_fill_partial_read_bio(struct pblk *pblk, struct nvm_rq *rqd,
>> -				      unsigned int bio_init_idx,
>> -				      unsigned long *read_bitmap)
>> +static int pblk_partial_read_bio(struct pblk *pblk, struct nvm_rq *rqd,
>> +				 unsigned int bio_init_idx,
>> +				 unsigned long *read_bitmap)
>>  {
>>  	struct bio *new_bio, *bio = rqd->bio;
>>  	struct pblk_sec_meta *meta_list = rqd->meta_list;
>> @@ -306,6 +310,8 @@ static int pblk_fill_partial_read_bio(struct pblk *pblk, struct nvm_rq *rqd,
>>  	return NVM_IO_OK;
>>
>>  err:
>> +	pr_err("pblk: failed to perform partial read\n");
>> +
>>  	/* Free allocated pages in new bio */
>>  	pblk_bio_free_pages(pblk, bio, 0, new_bio->bi_vcnt);
>>  	__pblk_end_io_read(pblk, rqd, false);
>
> This seems to include unrelated changes, like the rename above and the
> addition of the error logging?

Grah... I missed it during review. It should have been its own patch or
part of the early refactor patches. Thanks for picking it up.
On Fri, Jan 05, 2018 at 07:33:36PM +0100, Matias Bjørling wrote:
> On 01/05/2018 04:42 PM, Jens Axboe wrote:
> > On Fri, Jan 05 2018, Matias Bjørling wrote:
> > > From: Javier González <javier@cnexlabs.com>
> > >
> > > Since pblk registers its own block device, the iostat accounting is
> > > not automatically done for us. Therefore, add the necessary
> > > accounting logic to satisfy the iostat interface.
> >
> > Ignorant question - why is it a raw block device, not using blk-mq?
>
> The current flow is using the raw block device, together with the blk-mq
> nvme device driver. A bio is sent down to the nvme_nvm_submit_io() path in
> the /drivers/nvme/host/lightnvm.c file. From there it attaches to the NVMe
> blk-mq implementation.
>
> Is there a better way to do it?

I suspect the right way to do things is to split NVMe for different
I/O command sets, and make this an I/O command set.

But before touching much of NVMe, I'd really, really like to see an
actual spec first.
> On 8 Jan 2018, at 12.54, Christoph Hellwig <hch@infradead.org> wrote:
>
> On Fri, Jan 05, 2018 at 07:33:36PM +0100, Matias Bjørling wrote:
>> On 01/05/2018 04:42 PM, Jens Axboe wrote:
>>> On Fri, Jan 05 2018, Matias Bjørling wrote:
>>>> From: Javier González <javier@cnexlabs.com>
>>>>
>>>> Since pblk registers its own block device, the iostat accounting is
>>>> not automatically done for us. Therefore, add the necessary
>>>> accounting logic to satisfy the iostat interface.
>>>
>>> Ignorant question - why is it a raw block device, not using blk-mq?
>>
>> The current flow is using the raw block device, together with the blk-mq
>> nvme device driver. A bio is sent down to the nvme_nvm_submit_io() path in
>> the /drivers/nvme/host/lightnvm.c file. From there it attaches to the NVMe
>> blk-mq implementation.
>>
>> Is there a better way to do it?
>
> I suspect the right way to do things is to split NVMe for different
> I/O command sets, and make this an I/O command set.

This makes sense. This was actually how I implemented it to start with,
but I changed it to be less intrusive on the nvme path. Let's revert the
patch and we can add it back when we push the 2.0 patches.

> But before touching much of NVMe, I'd really, really like to see an
> actual spec first.

The 2.0 spec is open and is available here [1]. I thought you had
looked into it already... Anyway, feedback is more than welcome.

[1] https://docs.google.com/document/d/1kedBY_1-hfkAlqT4EdwY6gz-6UOZbn7kIjWpmBLPNj0

Javier
On Mon, Jan 8, 2018 at 1:53 PM, Javier González <jg@lightnvm.io> wrote:
>> On 8 Jan 2018, at 12.54, Christoph Hellwig <hch@infradead.org> wrote:
>>
>> On Fri, Jan 05, 2018 at 07:33:36PM +0100, Matias Bjørling wrote:
>>> On 01/05/2018 04:42 PM, Jens Axboe wrote:
>>>> On Fri, Jan 05 2018, Matias Bjørling wrote:
>>>>> From: Javier González <javier@cnexlabs.com>
>>>>>
>>>>> Since pblk registers its own block device, the iostat accounting is
>>>>> not automatically done for us. Therefore, add the necessary
>>>>> accounting logic to satisfy the iostat interface.
>>>>
>>>> Ignorant question - why is it a raw block device, not using blk-mq?
>>>
>>> The current flow is using the raw block device, together with the blk-mq
>>> nvme device driver. A bio is sent down to the nvme_nvm_submit_io() path in
>>> the /drivers/nvme/host/lightnvm.c file. From there it attaches to the NVMe
>>> blk-mq implementation.
>>>
>>> Is there a better way to do it?
>>
>> I suspect the right way to do things is to split NVMe for different
>> I/O command sets, and make this an I/O command set.
>
> This makes sense. This was actually how I implemented it to start with,
> but I changed it to be less intrusive on the nvme path. Let's revert the
> patch and we can add it back when we push the 2.0 patches.
>
>> But before touching much of NVMe, I'd really, really like to see an
>> actual spec first.
>
> The 2.0 spec is open and is available here [1]. I thought you had
> looked into it already... Anyway, feedback is more than welcome.
>
> [1] https://docs.google.com/document/d/1kedBY_1-hfkAlqT4EdwY6gz-6UOZbn7kIjWpmBLPNj0
>
> Javier

The 2.0 spec is still under development. No reason to redo the I/O
stacks until it is final.
diff --git a/drivers/lightnvm/pblk-cache.c b/drivers/lightnvm/pblk-cache.c
index 0d227ef..000fcad 100644
--- a/drivers/lightnvm/pblk-cache.c
+++ b/drivers/lightnvm/pblk-cache.c
@@ -19,12 +19,16 @@
 
 int pblk_write_to_cache(struct pblk *pblk, struct bio *bio, unsigned long flags)
 {
+	struct request_queue *q = pblk->dev->q;
 	struct pblk_w_ctx w_ctx;
 	sector_t lba = pblk_get_lba(bio);
+	unsigned long start_time = jiffies;
 	unsigned int bpos, pos;
 	int nr_entries = pblk_get_secs(bio);
 	int i, ret;
 
+	generic_start_io_acct(q, WRITE, bio_sectors(bio), &pblk->disk->part0);
+
 	/* Update the write buffer head (mem) with the entries that we can
 	 * write. The write in itself cannot fail, so there is no need to
 	 * rollback from here on.
@@ -67,6 +71,7 @@ int pblk_write_to_cache(struct pblk *pblk, struct bio *bio, unsigned long flags)
 	pblk_rl_inserted(&pblk->rl, nr_entries);
 
 out:
+	generic_end_io_acct(q, WRITE, &pblk->disk->part0, start_time);
 	pblk_write_should_kick(pblk);
 	return ret;
 }
diff --git a/drivers/lightnvm/pblk-read.c b/drivers/lightnvm/pblk-read.c
index 0fe0c04..2f76128 100644
--- a/drivers/lightnvm/pblk-read.c
+++ b/drivers/lightnvm/pblk-read.c
@@ -158,8 +158,12 @@ static void pblk_end_user_read(struct bio *bio)
 static void __pblk_end_io_read(struct pblk *pblk, struct nvm_rq *rqd,
 			       bool put_line)
 {
+	struct nvm_tgt_dev *dev = pblk->dev;
 	struct pblk_g_ctx *r_ctx = nvm_rq_to_pdu(rqd);
 	struct bio *bio = rqd->bio;
+	unsigned long start_time = r_ctx->start_time;
+
+	generic_end_io_acct(dev->q, READ, &pblk->disk->part0, start_time);
 
 	if (rqd->error)
 		pblk_log_read_err(pblk, rqd);
@@ -193,9 +197,9 @@ static void pblk_end_io_read(struct nvm_rq *rqd)
 	__pblk_end_io_read(pblk, rqd, true);
 }
 
-static int pblk_fill_partial_read_bio(struct pblk *pblk, struct nvm_rq *rqd,
-				      unsigned int bio_init_idx,
-				      unsigned long *read_bitmap)
+static int pblk_partial_read_bio(struct pblk *pblk, struct nvm_rq *rqd,
+				 unsigned int bio_init_idx,
+				 unsigned long *read_bitmap)
 {
 	struct bio *new_bio, *bio = rqd->bio;
 	struct pblk_sec_meta *meta_list = rqd->meta_list;
@@ -306,6 +310,8 @@ static int pblk_fill_partial_read_bio(struct pblk *pblk, struct nvm_rq *rqd,
 	return NVM_IO_OK;
 
 err:
+	pr_err("pblk: failed to perform partial read\n");
+
 	/* Free allocated pages in new bio */
 	pblk_bio_free_pages(pblk, bio, 0, new_bio->bi_vcnt);
 	__pblk_end_io_read(pblk, rqd, false);
@@ -357,6 +363,7 @@ static void pblk_read_rq(struct pblk *pblk, struct nvm_rq *rqd,
 int pblk_submit_read(struct pblk *pblk, struct bio *bio)
 {
 	struct nvm_tgt_dev *dev = pblk->dev;
+	struct request_queue *q = dev->q;
 	sector_t blba = pblk_get_lba(bio);
 	unsigned int nr_secs = pblk_get_secs(bio);
 	struct pblk_g_ctx *r_ctx;
@@ -372,6 +379,8 @@ int pblk_submit_read(struct pblk *pblk, struct bio *bio)
 		return NVM_IO_ERR;
 	}
 
+	generic_start_io_acct(q, READ, bio_sectors(bio), &pblk->disk->part0);
+
 	bitmap_zero(&read_bitmap, nr_secs);
 
 	rqd = pblk_alloc_rqd(pblk, PBLK_READ);
@@ -383,6 +392,7 @@ int pblk_submit_read(struct pblk *pblk, struct bio *bio)
 	rqd->end_io = pblk_end_io_read;
 
 	r_ctx = nvm_rq_to_pdu(rqd);
+	r_ctx->start_time = jiffies;
 	r_ctx->lba = blba;
 
 	/* Save the index for this bio's start. This is needed in case
@@ -422,7 +432,7 @@ int pblk_submit_read(struct pblk *pblk, struct bio *bio)
 		int_bio = bio_clone_fast(bio, GFP_KERNEL, pblk_bio_set);
 		if (!int_bio) {
 			pr_err("pblk: could not clone read bio\n");
-			return NVM_IO_ERR;
+			goto fail_end_io;
 		}
 
 		rqd->bio = int_bio;
@@ -433,7 +443,7 @@ int pblk_submit_read(struct pblk *pblk, struct bio *bio)
 			pr_err("pblk: read IO submission failed\n");
 			if (int_bio)
 				bio_put(int_bio);
-			return ret;
+			goto fail_end_io;
 		}
 
 		return NVM_IO_OK;
@@ -442,17 +452,14 @@ int pblk_submit_read(struct pblk *pblk, struct bio *bio)
 	}
 
 	/* The read bio request could be partially filled by the write buffer,
 	 * but there are some holes that need to be read from the drive.
 	 */
-	ret = pblk_fill_partial_read_bio(pblk, rqd, bio_init_idx, &read_bitmap);
-	if (ret) {
-		pr_err("pblk: failed to perform partial read\n");
-		return ret;
-	}
-
-	return NVM_IO_OK;
+	return pblk_partial_read_bio(pblk, rqd, bio_init_idx, &read_bitmap);
 
 fail_rqd_free:
 	pblk_free_rqd(pblk, rqd, PBLK_READ);
 	return ret;
+fail_end_io:
+	__pblk_end_io_read(pblk, rqd, false);
+	return ret;
 }
 
 static int read_ppalist_rq_gc(struct pblk *pblk, struct nvm_rq *rqd,
diff --git a/drivers/lightnvm/pblk.h b/drivers/lightnvm/pblk.h
index 93ec4fd..8af374e 100644
--- a/drivers/lightnvm/pblk.h
+++ b/drivers/lightnvm/pblk.h
@@ -113,6 +113,7 @@ struct pblk_c_ctx {
 
 /* read context */
 struct pblk_g_ctx {
 	void *private;
+	unsigned long start_time;
 	u64 lba;
 };