[10/18] lightnvm: pblk: use bio_copy_kern when possible

Message ID 1504695071-25928-11-git-send-email-javier@cnexlabs.com (mailing list archive)
State New, archived

Commit Message

Javier González Sept. 6, 2017, 10:51 a.m. UTC
In pblk, buffers forming bios can be allocated on physically contiguous
or virtually contiguous memory. For physically contiguous memory, we
already use the bio_map_kern helper function; for virtually contiguous
memory, however, we form the bio manually. This makes the code more
complex, especially on the completion path, where mapped pages need to
be freed.

Instead, use bio_copy_kern, which does the same while simplifying the
completion path.

Signed-off-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <matias@cnexlabs.com>
---
 drivers/lightnvm/pblk-core.c     | 39 ++++-----------------------------------
 drivers/lightnvm/pblk-read.c     |  3 +--
 drivers/lightnvm/pblk-recovery.c |  3 +--
 drivers/lightnvm/pblk-write.c    |  7 +------
 drivers/lightnvm/pblk.h          |  2 +-
 5 files changed, 8 insertions(+), 46 deletions(-)

Comments

Christoph Hellwig Sept. 6, 2017, 1:47 p.m. UTC | #1
On Wed, Sep 06, 2017 at 12:51:03PM +0200, Javier González wrote:
> In pblk, buffers forming bios can be allocated on physically contiguous
> or virtually contiguous memory. For physically contiguous memory, we
> already use the bio_map_kern helper function; for virtually contiguous
> memory, however, we form the bio manually. This makes the code more
> complex, especially on the completion path, where mapped pages need to
> be freed.
> 
> Instead, use bio_copy_kern, which does the same while simplifying the
> completion path.

Nope.  You want to loop over vmalloc_to_page and call bio_add_page
for each page, after taking care of virtually tagged caches instead
of this bounce buffering.
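
For illustration, a minimal sketch of that loop, assuming a 4.x-era
kernel (it mirrors the open-coded path this patch removes, with the
cache flush added up front; the function name and nr_pages parameter
are made up for the example):

static struct bio *vmalloc_buf_to_bio(struct request_queue *q, void *kaddr,
				      unsigned int nr_pages, gfp_t gfp_mask)
{
	struct bio *bio;
	int i;

	/*
	 * On virtually tagged caches the kernel mapping must be flushed
	 * before the device reads the buffer; reads from the device would
	 * instead call invalidate_kernel_vmap_range() on completion.
	 */
	flush_kernel_vmap_range(kaddr, nr_pages * PAGE_SIZE);

	bio = bio_kmalloc(gfp_mask, nr_pages);
	if (!bio)
		return ERR_PTR(-ENOMEM);

	for (i = 0; i < nr_pages; i++) {
		struct page *page = vmalloc_to_page(kaddr);

		if (!page || bio_add_pc_page(q, bio, page, PAGE_SIZE, 0)
							!= PAGE_SIZE) {
			bio_put(bio);
			return ERR_PTR(-ENOMEM);
		}

		kaddr += PAGE_SIZE;
	}

	return bio;
}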

And you really want to allocate the request first and only then map
the data to the request, as said before.
Javier González Sept. 6, 2017, 2 p.m. UTC | #2
> On 6 Sep 2017, at 15.47, Christoph Hellwig <hch@infradead.org> wrote:
> 
> On Wed, Sep 06, 2017 at 12:51:03PM +0200, Javier González wrote:
>> In pblk, buffers forming bios can be allocated on physically contiguous
>> or virtually contiguous memory. For physically contiguous memory, we
>> already use the bio_map_kern helper function; for virtually contiguous
>> memory, however, we form the bio manually. This makes the code more
>> complex, especially on the completion path, where mapped pages need to
>> be freed.
>> 
>> Instead, use bio_copy_kern, which does the same while simplifying the
>> completion path.
> 
> Nope.  You want to loop over vmalloc_to_page and call bio_add_page
> for each page,

Yes. This is basically what I did before.

> after taking care of virtually tagged caches instead
> of this bounce buffering.

And thus I considered bio_copy_kern to be a better solution, since over
time it will take care of doing vmalloc_to_page correctly for all
cases.

> 
> And you really want to allocate the request first and only then map
> the data to the request, as said before.

Ok. So this would mean that targets (e.g., pblk) deal with struct
request instead of only dealing with bios and then letting the LightNVM
core transform bios into requests. This way we can map directly to the
request. Is this what you mean?

Just out of curiosity, why is forming the bio through bio_copy_kern (or
manually doing the same) and then transforming it into a request
incorrect / worse?

Javier
Christoph Hellwig Sept. 7, 2017, 11:08 a.m. UTC | #3
On Wed, Sep 06, 2017 at 04:00:56PM +0200, Javier González wrote:
> > Nope.  You want to loop over vmalloc_to_page and call bio_add_page
> > for each page,
> 
> Yes. This is basically what I did before.
> 
> > after taking care of virtually tagged caches instead
> > of this bounce buffering.
> 
> And thus I considered bio_copy_kern to be a better solution, since over
> time it will take care of doing vmalloc_to_page correctly for all
> cases.

bio_copy_kern copies all the data, so it is generally not a good
idea.  The cache flushing isn't too hard - take a look at the XFS
buffer cache for an existing version.

It would be good to just do the right thing inside bio_map_kern so that
callers don't need to care whether the buffer is vmalloced or not.
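
A hedged sketch of what that could look like (is_vmalloc_addr() is a
real helper; the two __bio_map_* functions are hypothetical stand-ins
for the existing kmalloc path and the vmalloc_to_page() loop sketched
above):

struct bio *bio_map_kern(struct request_queue *q, void *data,
			 unsigned int len, gfp_t gfp_mask)
{
	/* Dispatch on the address type so callers don't have to. */
	if (is_vmalloc_addr(data))
		return __bio_map_vmalloc(q, data, len, gfp_mask); /* hypothetical */

	return __bio_map_kmalloc(q, data, len, gfp_mask); /* hypothetical */
}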

> Ok. So this would mean that targets (e.g., pblk) deal with struct
> request instead of only dealing with bios and then letting the LightNVM
> core transform bios into requests. This way we can map directly to the
> request. Is this what you mean?

Yes.

> Just out of curiosity, why is forming the bio through bio_copy_kern (or
> manually doing the same) and then transforming it into a request
> incorrect / worse?

Because you expose yourself to the details of mapping a bio to a
request. We had to export blk_init_request_from_bio just for lightnvm
to do this, and it also has to do other weird bits with requests. If
you go through blk_rq_map_* the block layer takes care of all that for
you.
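
For context, a minimal sketch of that request-first flow under the
4.x-era API (the submit_kern_buf() wrapper and its call site are
assumptions, and completion/error handling is elided):

static int submit_kern_buf(struct request_queue *q, void *kbuf,
			   unsigned int len)
{
	struct request *rq;
	int ret;

	/* Allocate the request first... */
	rq = blk_get_request(q, REQ_OP_DRV_IN, GFP_KERNEL);
	if (IS_ERR(rq))
		return PTR_ERR(rq);

	/* ...then let the block layer map the kernel buffer onto it. */
	ret = blk_rq_map_kern(q, rq, kbuf, len, GFP_KERNEL);
	if (ret) {
		blk_put_request(rq);
		return ret;
	}

	blk_execute_rq(q, NULL, rq, 0);
	blk_put_request(rq);
	return 0;
}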
Javier González Sept. 7, 2017, 11:20 a.m. UTC | #4
> On 7 Sep 2017, at 13.08, Christoph Hellwig <hch@infradead.org> wrote:
> 
> On Wed, Sep 06, 2017 at 04:00:56PM +0200, Javier González wrote:
>>> Nope.  You want to loop over vmalloc_to_page and call bio_add_page
>>> for each page,
>> 
>> Yes. This is basically what I did before.
>> 
>>> after taking care of virtually tagged caches instead
>>> of this bounce buffering.
>> 
>> And thus I considered bio_copy_kern to be a better solution, since over
>> time it will take care of doing vmalloc_to_page correctly for all
>> cases.
> 
> bio_copy_kern copies all the data, so it is generally not a good
> idea.  The cache flushing isn't too hard - take a look at the XFS
> buffer cache for an existing version.
> 
> It would be good to just do the right thing inside bio_map_kern so that
> callers don't need to care whether the buffer is vmalloced or not.

Yes. That would help. I know md also needs to manually add pages for
vmalloced memory. Probably others do too.

> 
>> Ok. So this would mean that targets (e.g., pblk) deal with struct
>> request instead of only dealing with bios and then letting the LightNVM
>> core transform bios into requests. This way we can map directly to the
>> request. Is this what you mean?
> 
> Yes.
> 
>> Just out of curiosity, why is forming the bio through bio_copy_kern (or
>> manually doing the same) and then transforming it into a request
>> incorrect / worse?
> 
> Because you expose yourself to the details of mapping a bio to a
> request. We had to export blk_init_request_from_bio just for lightnvm
> to do this, and it also has to do other weird bits with requests. If
> you go through blk_rq_map_* the block layer takes care of all that for
> you.

Ok. It makes sense. I'll talk to Matias about it.


Thanks!
Javier
Patch

diff --git a/drivers/lightnvm/pblk-core.c b/drivers/lightnvm/pblk-core.c
index f6b9afbe8589..e69e8829b093 100644
--- a/drivers/lightnvm/pblk-core.c
+++ b/drivers/lightnvm/pblk-core.c
@@ -423,42 +423,14 @@  int pblk_submit_io(struct pblk *pblk, struct nvm_rq *rqd)
 
 struct bio *pblk_bio_map_addr(struct pblk *pblk, void *data,
 			      unsigned int nr_secs, unsigned int len,
-			      int alloc_type, gfp_t gfp_mask)
+			      int alloc_type, gfp_t gfp_mask, int reading)
 {
 	struct nvm_tgt_dev *dev = pblk->dev;
-	void *kaddr = data;
-	struct page *page;
-	struct bio *bio;
-	int i, ret;
 
 	if (alloc_type == PBLK_KMALLOC_META)
-		return bio_map_kern(dev->q, kaddr, len, gfp_mask);
+		return bio_map_kern(dev->q, data, len, gfp_mask);
 
-	bio = bio_kmalloc(gfp_mask, nr_secs);
-	if (!bio)
-		return ERR_PTR(-ENOMEM);
-
-	for (i = 0; i < nr_secs; i++) {
-		page = vmalloc_to_page(kaddr);
-		if (!page) {
-			pr_err("pblk: could not map vmalloc bio\n");
-			bio_put(bio);
-			bio = ERR_PTR(-ENOMEM);
-			goto out;
-		}
-
-		ret = bio_add_pc_page(dev->q, bio, page, PAGE_SIZE, 0);
-		if (ret != PAGE_SIZE) {
-			pr_err("pblk: could not add page to bio\n");
-			bio_put(bio);
-			bio = ERR_PTR(-ENOMEM);
-			goto out;
-		}
-
-		kaddr += PAGE_SIZE;
-	}
-out:
-	return bio;
+	return bio_copy_kern(dev->q, data, len, GFP_KERNEL, reading);
 }
 
 int pblk_calc_secs(struct pblk *pblk, unsigned long secs_avail,
@@ -588,7 +560,7 @@  static int pblk_line_submit_emeta_io(struct pblk *pblk, struct pblk_line *line,
 	rq_len = rq_ppas * geo->sec_size;
 
 	bio = pblk_bio_map_addr(pblk, emeta_buf, rq_ppas, rq_len,
-					l_mg->emeta_alloc_type, GFP_KERNEL);
+			l_mg->emeta_alloc_type, GFP_KERNEL, dir == READ);
 	if (IS_ERR(bio)) {
 		ret = PTR_ERR(bio);
 		goto free_rqd_dma;
@@ -673,9 +645,6 @@  static int pblk_line_submit_emeta_io(struct pblk *pblk, struct pblk_line *line,
 	atomic_dec(&pblk->inflight_io);
 	reinit_completion(&wait);
 
-	if (likely(pblk->l_mg.emeta_alloc_type == PBLK_VMALLOC_META))
-		bio_put(bio);
-
 	if (rqd.error) {
 		if (dir == WRITE)
 			pblk_log_write_err(pblk, &rqd);
diff --git a/drivers/lightnvm/pblk-read.c b/drivers/lightnvm/pblk-read.c
index f32091503547..1be972521dcd 100644
--- a/drivers/lightnvm/pblk-read.c
+++ b/drivers/lightnvm/pblk-read.c
@@ -542,7 +542,7 @@  int pblk_submit_read_gc(struct pblk *pblk, struct pblk_gc_rq *gc_rq)
 
 	data_len = (gc_rq->secs_to_gc) * geo->sec_size;
 	bio = pblk_bio_map_addr(pblk, gc_rq->data, gc_rq->secs_to_gc, data_len,
-						PBLK_VMALLOC_META, GFP_KERNEL);
+					PBLK_VMALLOC_META, GFP_KERNEL, 1);
 	if (IS_ERR(bio)) {
 		pr_err("pblk: could not allocate GC bio (%lu)\n", PTR_ERR(bio));
 		goto err_free_dma;
@@ -583,7 +583,6 @@  int pblk_submit_read_gc(struct pblk *pblk, struct pblk_gc_rq *gc_rq)
 	atomic_long_sub(gc_rq->secs_to_gc, &pblk->inflight_reads);
 #endif
 
-	bio_put(bio);
 out:
 	nvm_dev_dma_free(dev->parent, rqd.meta_list, rqd.dma_meta_list);
 	return ret;
diff --git a/drivers/lightnvm/pblk-recovery.c b/drivers/lightnvm/pblk-recovery.c
index 279638309d9a..b033d4f2b446 100644
--- a/drivers/lightnvm/pblk-recovery.c
+++ b/drivers/lightnvm/pblk-recovery.c
@@ -342,7 +342,6 @@  static void pblk_end_io_recov(struct nvm_rq *rqd)
 
 	pblk_up_page(pblk, rqd->ppa_list, rqd->nr_ppas);
 
-	bio_put(rqd->bio);
 	nvm_dev_dma_free(dev->parent, rqd->meta_list, rqd->dma_meta_list);
 	pblk_free_rqd(pblk, rqd, WRITE_INT);
 
@@ -411,7 +410,7 @@  static int pblk_recov_pad_oob(struct pblk *pblk, struct pblk_line *line,
 	}
 
 	bio = pblk_bio_map_addr(pblk, data, rq_ppas, rq_len,
-						PBLK_VMALLOC_META, GFP_KERNEL);
+					PBLK_VMALLOC_META, GFP_KERNEL, 0);
 	if (IS_ERR(bio)) {
 		ret = PTR_ERR(bio);
 		goto fail_free_rqd;
diff --git a/drivers/lightnvm/pblk-write.c b/drivers/lightnvm/pblk-write.c
index 920b93fb15d5..e5d77af1aad1 100644
--- a/drivers/lightnvm/pblk-write.c
+++ b/drivers/lightnvm/pblk-write.c
@@ -187,17 +187,12 @@  static void pblk_end_io_write_meta(struct nvm_rq *rqd)
 		pblk_log_write_err(pblk, rqd);
 		pr_err("pblk: metadata I/O failed. Line %d\n", line->id);
 	}
-#ifdef CONFIG_NVM_DEBUG
-	else
-		WARN_ONCE(rqd->bio->bi_status, "pblk: corrupted write error\n");
-#endif
 
 	sync = atomic_add_return(rqd->nr_ppas, &emeta->sync);
 	if (sync == emeta->nr_entries)
 		pblk_line_run_ws(pblk, line, NULL, pblk_line_close_ws,
 								pblk->close_wq);
 
-	bio_put(rqd->bio);
 	nvm_dev_dma_free(dev->parent, rqd->meta_list, rqd->dma_meta_list);
 	pblk_free_rqd(pblk, rqd, WRITE_INT);
 
@@ -382,7 +377,7 @@  int pblk_submit_meta_io(struct pblk *pblk, struct pblk_line *meta_line)
 	data = ((void *)emeta->buf) + emeta->mem;
 
 	bio = pblk_bio_map_addr(pblk, data, rq_ppas, rq_len,
-					l_mg->emeta_alloc_type, GFP_KERNEL);
+				l_mg->emeta_alloc_type, GFP_KERNEL, 0);
 	if (IS_ERR(bio)) {
 		ret = PTR_ERR(bio);
 		goto fail_free_rqd;
diff --git a/drivers/lightnvm/pblk.h b/drivers/lightnvm/pblk.h
index 3771f876ce80..2304a4891f20 100644
--- a/drivers/lightnvm/pblk.h
+++ b/drivers/lightnvm/pblk.h
@@ -712,7 +712,7 @@  int pblk_submit_io(struct pblk *pblk, struct nvm_rq *rqd);
 int pblk_submit_meta_io(struct pblk *pblk, struct pblk_line *meta_line);
 struct bio *pblk_bio_map_addr(struct pblk *pblk, void *data,
 			      unsigned int nr_secs, unsigned int len,
-			      int alloc_type, gfp_t gfp_mask);
+			      int alloc_type, gfp_t gfp_mask, int reading);
 struct pblk_line *pblk_line_get(struct pblk *pblk);
 struct pblk_line *pblk_line_get_first_data(struct pblk *pblk);
 void pblk_line_replace_data(struct pblk *pblk);