SQUASHME: Streamline pmem.c

Message ID 55143B99.7060407@plexistor.com (mailing list archive)
State New, archived

Commit Message

Boaz Harrosh March 26, 2015, 5:02 p.m. UTC
Christoph why did you choose the fat and ugly version of
pmem.c beats me. Anyway, here are the cleanups you need on
top of your pmem patch.

Among other things, it does:
* Remove getgeo. It is not needed for modern fdisk and was never
  needed for libgparted and cfdisk.

* Replace 89 lines of code with a single memcpy. The reason
  this was so in brd (done badly BTW) is because destination
  memory is page-by-page based. With pmem we have the destination
  contiguous so we can do any size, in one go.

* Remove SECTOR_SHIFT. It is defined in 6 other places
  in the Kernel. I do not like a new one. 9 is used throughout,
  including block core. I do not like pmem to blaspheme
  more than needed.

* More style stuff ...

Please squash into your initial submission

Signed-off-by: Boaz Harrosh <boaz@plexistor.com>
---
 drivers/block/pmem.c | 137 +++++++++++----------------------------------------
 1 file changed, 28 insertions(+), 109 deletions(-)
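
The memcpy point in the commit message can be illustrated with a small userspace sketch. It is not the driver code: `SKETCH_PAGE_SIZE`, `sketch_copy_to_paged` and `sketch_copy_to_contig` are hypothetical names, and plain byte buffers stand in for brd's per-page store and for the pmem mapping. It only shows why a page-by-page destination forces the copy to be split while a contiguous destination does not.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096u	/* hypothetical stand-in for PAGE_SIZE */

/*
 * brd-style destination: an array of separately allocated pages.
 * A copy has to be split wherever it crosses a page boundary.
 */
static void sketch_copy_to_paged(uint8_t **pages, size_t off,
				 const uint8_t *src, size_t n)
{
	while (n) {
		size_t in_page = off % SKETCH_PAGE_SIZE;
		size_t chunk = SKETCH_PAGE_SIZE - in_page;

		if (chunk > n)
			chunk = n;
		memcpy(pages[off / SKETCH_PAGE_SIZE] + in_page, src, chunk);
		src += chunk;
		off += chunk;
		n -= chunk;
	}
}

/*
 * pmem-style destination: one contiguous mapping, so any length
 * can be copied with a single call.
 */
static void sketch_copy_to_contig(uint8_t *base, size_t off,
				  const uint8_t *src, size_t n)
{
	memcpy(base + off, src, n);
}
```

Both helpers leave the same bytes at the same logical offsets; only the paged variant needs the loop.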

Comments

Christoph Hellwig March 26, 2015, 5:23 p.m. UTC | #1
On Thu, Mar 26, 2015 at 07:02:17PM +0200, Boaz Harrosh wrote:
> Christoph why did you choose the fat and ugly version of
> pmem.c beats me. Anyway, here are the cleanups you need on
> top of your pmem patch.
> 
> Among other things, it does:
> * Remove getgeo. It is not needed for modern fdisk and was never
>   needed for libgparted and cfdisk.
> 
> * Replace 89 lines of code with a single memcpy. The reason
>   this was so in brd (done badly BTW) is because destination
>   memory is page-by-page based. With pmem we have the destination
>   contiguous so we can do any size, in one go.
> 
> * Remove SECTOR_SHIFT. It is defined in 6 other places
>   in the Kernel. I do not like a new one. 9 is used throughout,
>   including block core. I do not like pmem to blaspheme
>   more than needed.
> 
> * More style stuff ...

One patch per item, please.

> - * This driver is heavily based on drivers/block/brd.c.
> + * This driver's skeleton is based on drivers/block/brd.c.
>   * Copyright (C) 2007 Nick Piggin
>   * Copyright (C) 2007 Novell Inc.

Looks like there is basically nothing left of brd.c after this patch,
so we might as well drop this.

> -/*
> - * direct translation from (pmem,sector) => void*
> - * We do not require that sector be page aligned.
> - * The return value will point to the beginning of the page containing the
> - * given sector, not to the sector itself.
> - */

Not quite related to your patch: all the pmem and direct_access code uses
normal kernel address pointers, but we're actually dealing with iomem
here, which makes sparse a little unhappy.

> -	BUG_ON(bio->bi_rw & REQ_DISCARD);
> +	if (WARN_ON(bio->bi_rw & REQ_DISCARD)) {
> +		err = -EINVAL;
> +		goto out;
> +	}

No need to write additional code here, I'd rather remove it entirely
if the BUG_ON bothers you.  There is no way we'll get a discard without
the driver asking for it.  And then you'd have to check for all the
other non-standard I/O types as well.

> +		/* NOTE: There is a legend saying that bv_len might be
> +		 * bigger than PAGE_SIZE in the case that bv_page points to
> +		 * a physical contiguous PFN set. But for us it is fine because
> +		 * it means the Kernel virtual mapping is also contiguous. And
> +		 * on the pmem side we are always contiguous both virtual and
> +		 * physical
> +		 */

Linux comment style has the opening "/*" on its own line.  And talking
about legends in comments isn't a very nice style either.
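
As a sketch of the comment layout being asked for, the block below puts the opening "/*" on its own line and states the assumption plainly. `sketch_advance_sector` is a hypothetical helper written for illustration, not part of the patch, and the comment only restates the claim made in the patch rather than confirming it.

```c
#include <stdint.h>

/*
 * The segment length may exceed one page when the pages behind it are
 * physically contiguous (the claim made in the patch, not verified
 * here). The device side is contiguous as well, so the sector simply
 * advances by the full segment length in 512-byte units.
 */
static uint64_t sketch_advance_sector(uint64_t sector, uint32_t bv_len)
{
	return sector + (bv_len >> 9);
}
```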

Ross Zwisler March 26, 2015, 10:17 p.m. UTC | #2
On Thu, 2015-03-26 at 19:02 +0200, Boaz Harrosh wrote:
> Christoph why did you choose the fat and ugly version of
> pmem.c beats me. Anyway, here are the cleanups you need on
> top of your pmem patch.
> 
> Among other things, it does:
> * Remove getgeo. It is not needed for modern fdisk and was never
>   needed for libgparted and cfdisk.
> 
> * Replace 89 lines of code with a single memcpy. The reason
>   this was so in brd (done badly BTW) is because destination
>   memory is page-by-page based. With pmem we have the destination
>   contiguous so we can do any size, in one go.
> 
> * Remove SECTOR_SHIFT. It is defined in 6 other places
>   in the Kernel. I do not like a new one. 9 is used throughout,
>   including block core. I do not like pmem to blaspheme
>   more than needed.
> 
> * More style stuff ...
> 
> Please squash into your initial submission
> 
> Signed-off-by: Boaz Harrosh <boaz@plexistor.com>

I agree with Christoph's comments, but overall I think these changes are
great.  Please send out as a series & you can add:

Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com>

Ross Zwisler March 26, 2015, 10:22 p.m. UTC | #3
On Thu, 2015-03-26 at 19:02 +0200, Boaz Harrosh wrote:
>  static void pmem_do_bvec(struct pmem_device *pmem, struct page *page,
>  			unsigned int len, unsigned int off, int rw,
>  			sector_t sector)
>  {
>  	void *mem = kmap_atomic(page);
> +	size_t pmem_off = sector << 9;
> +
> +	BUG_ON(pmem_off >= pmem->size);

This check should take 'len' into account so we don't copy off the end of our
PMEM space.

We should also just return -EIO back up to pmem_make_request() and have that
fail the bio, as opposed to doing the drastic BUG_ON.
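
A minimal userspace sketch of such a check, assuming 512-byte sectors as in the patch. `sketch_pmem_range_check` is a hypothetical name, and how the error would be propagated through pmem_do_bvec() to pmem_make_request() is left out. The comparison is arranged so that neither the shift nor an addition can overflow.

```c
#include <errno.h>
#include <stdint.h>

/*
 * Return 0 if the byte range starting at (sector << 9) with length len
 * lies entirely within a device of size bytes, or -EIO otherwise.
 */
static int sketch_pmem_range_check(uint64_t sector, uint64_t len,
				   uint64_t size)
{
	uint64_t off;

	/* Reject sectors whose byte offset would not fit in 64 bits. */
	if (sector > (UINT64_MAX >> 9))
		return -EIO;

	off = sector << 9;

	/* Compare against the remaining space so off + len cannot wrap. */
	if (off > size || len > size - off)
		return -EIO;

	return 0;
}
```

With this shape the caller can fail the bio with the returned error instead of hitting a BUG_ON.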

Dan Williams March 26, 2015, 11:31 p.m. UTC | #4
On Thu, Mar 26, 2015 at 10:02 AM, Boaz Harrosh <boaz@plexistor.com> wrote:
>
> Christoph why did you choose the fat and ugly version of
> pmem.c beats me.

Boaz, I am so very tired of your snide commentary.  It severely
detracts from the technical merit of your patches.  Please stop.

Boaz Harrosh March 31, 2015, 1:44 p.m. UTC | #5
On 03/27/2015 01:31 AM, Dan Williams wrote:
> On Thu, Mar 26, 2015 at 10:02 AM, Boaz Harrosh <boaz@plexistor.com> wrote:
>>
>> Christoph why did you choose the fat and ugly version of
>> pmem.c beats me.
> 
> Boaz, I am so very tired of your snide commentary.  It severely
> detracts from the technical merit of your patches.  Please stop.
> 

Hi Dan

snide (snīd)
adj. snid·er, snid·est
 1. Mocking or derogatory in a malicious or ironic way

Please do not take me seriously. I'm just a joke ;-)
There is nothing "malicious" or "mocking" in any of my words. Yes, maybe
some "irony".

I think "severely detracts" is a bit of an exaggeration, no?

All I really care about is, like you, the "technical merit".

Thanks
Boaz

Patch

diff --git a/drivers/block/pmem.c b/drivers/block/pmem.c
index 545b13b..5a57a06 100644
--- a/drivers/block/pmem.c
+++ b/drivers/block/pmem.c
@@ -11,7 +11,7 @@ 
  * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
  * more details.
  *
- * This driver is heavily based on drivers/block/brd.c.
+ * This driver's skeleton is based on drivers/block/brd.c.
  * Copyright (C) 2007 Nick Piggin
  * Copyright (C) 2007 Novell Inc.
  */
@@ -24,11 +24,6 @@ 
 #include <linux/module.h>
 #include <linux/moduleparam.h>
 #include <linux/slab.h>
-
-#define SECTOR_SHIFT		9
-#define PAGE_SECTORS_SHIFT	(PAGE_SHIFT - SECTOR_SHIFT)
-#define PAGE_SECTORS		(1 << PAGE_SECTORS_SHIFT)
-
 #define PMEM_MINORS		16
 
 struct pmem_device {
@@ -44,100 +39,17 @@  struct pmem_device {
 static int pmem_major;
 static atomic_t pmem_index;
 
-static int pmem_getgeo(struct block_device *bd, struct hd_geometry *geo)
-{
-	/* some standard values */
-	geo->heads = 1 << 6;
-	geo->sectors = 1 << 5;
-	geo->cylinders = get_capacity(bd->bd_disk) >> 11;
-	return 0;
-}
-
-/*
- * direct translation from (pmem,sector) => void*
- * We do not require that sector be page aligned.
- * The return value will point to the beginning of the page containing the
- * given sector, not to the sector itself.
- */
-static void *pmem_lookup_pg_addr(struct pmem_device *pmem, sector_t sector)
-{
-	size_t page_offset = sector >> PAGE_SECTORS_SHIFT;
-	size_t offset = page_offset << PAGE_SHIFT;
-
-	BUG_ON(offset >= pmem->size);
-	return pmem->virt_addr + offset;
-}
-
-/* sector must be page aligned */
-static unsigned long pmem_lookup_pfn(struct pmem_device *pmem, sector_t sector)
-{
-	size_t page_offset = sector >> PAGE_SECTORS_SHIFT;
-
-	BUG_ON(sector & (PAGE_SECTORS - 1));
-	return (pmem->phys_addr >> PAGE_SHIFT) + page_offset;
-}
-
-/*
- * sector is not required to be page aligned.
- * n is at most a single page, but could be less.
- */
-static void copy_to_pmem(struct pmem_device *pmem, const void *src,
-			sector_t sector, size_t n)
-{
-	void *dst;
-	unsigned int offset = (sector & (PAGE_SECTORS - 1)) << SECTOR_SHIFT;
-	size_t copy;
-
-	BUG_ON(n > PAGE_SIZE);
-
-	copy = min_t(size_t, n, PAGE_SIZE - offset);
-	dst = pmem_lookup_pg_addr(pmem, sector);
-	memcpy(dst + offset, src, copy);
-
-	if (copy < n) {
-		src += copy;
-		sector += copy >> SECTOR_SHIFT;
-		copy = n - copy;
-		dst = pmem_lookup_pg_addr(pmem, sector);
-		memcpy(dst, src, copy);
-	}
-}
-
-/*
- * sector is not required to be page aligned.
- * n is at most a single page, but could be less.
- */
-static void copy_from_pmem(void *dst, struct pmem_device *pmem,
-			  sector_t sector, size_t n)
-{
-	void *src;
-	unsigned int offset = (sector & (PAGE_SECTORS - 1)) << SECTOR_SHIFT;
-	size_t copy;
-
-	BUG_ON(n > PAGE_SIZE);
-
-	copy = min_t(size_t, n, PAGE_SIZE - offset);
-	src = pmem_lookup_pg_addr(pmem, sector);
-
-	memcpy(dst, src + offset, copy);
-
-	if (copy < n) {
-		dst += copy;
-		sector += copy >> SECTOR_SHIFT;
-		copy = n - copy;
-		src = pmem_lookup_pg_addr(pmem, sector);
-		memcpy(dst, src, copy);
-	}
-}
-
 static void pmem_do_bvec(struct pmem_device *pmem, struct page *page,
 			unsigned int len, unsigned int off, int rw,
 			sector_t sector)
 {
 	void *mem = kmap_atomic(page);
+	size_t pmem_off = sector << 9;
+
+	BUG_ON(pmem_off >= pmem->size);
 
 	if (rw == READ) {
-		copy_from_pmem(mem + off, pmem, sector, len);
+		memcpy(mem + off, pmem->virt_addr + pmem_off, len);
 		flush_dcache_page(page);
 	} else {
 		/*
@@ -145,7 +57,7 @@  static void pmem_do_bvec(struct pmem_device *pmem, struct page *page,
 		 * NVDIMMs are actually durable before returning.
 		 */
 		flush_dcache_page(page);
-		copy_to_pmem(pmem, mem + off, sector, len);
+		memcpy(pmem->virt_addr + pmem_off, mem + off, len);
 	}
 
 	kunmap_atomic(mem);
@@ -161,25 +73,32 @@  static void pmem_make_request(struct request_queue *q, struct bio *bio)
 	struct bvec_iter iter;
 	int err = 0;
 
-	sector = bio->bi_iter.bi_sector;
-	if (bio_end_sector(bio) > get_capacity(bdev->bd_disk)) {
+	if (unlikely(bio_end_sector(bio) > get_capacity(bdev->bd_disk))) {
 		err = -EIO;
 		goto out;
 	}
 
-	BUG_ON(bio->bi_rw & REQ_DISCARD);
+	if (WARN_ON(bio->bi_rw & REQ_DISCARD)) {
+		err = -EINVAL;
+		goto out;
+	}
 
 	rw = bio_rw(bio);
 	if (rw == READA)
 		rw = READ;
 
+	sector = bio->bi_iter.bi_sector;
 	bio_for_each_segment(bvec, bio, iter) {
-		unsigned int len = bvec.bv_len;
-
-		BUG_ON(len > PAGE_SIZE);
-		pmem_do_bvec(pmem, bvec.bv_page, len,
-			    bvec.bv_offset, rw, sector);
-		sector += len >> SECTOR_SHIFT;
+		/* NOTE: There is a legend saying that bv_len might be
+		 * bigger than PAGE_SIZE in the case that bv_page points to
+		 * a physical contiguous PFN set. But for us it is fine because
+		 * it means the Kernel virtual mapping is also contiguous. And
+		 * on the pmem side we are always contiguous both virtual and
+		 * physical
+		 */
+		pmem_do_bvec(pmem, bvec.bv_page, bvec.bv_len, bvec.bv_offset,
+			     rw, sector);
+		sector += bvec.bv_len >> 9;
 	}
 
 out:
@@ -200,21 +119,21 @@  static long pmem_direct_access(struct block_device *bdev, sector_t sector,
 			      void **kaddr, unsigned long *pfn, long size)
 {
 	struct pmem_device *pmem = bdev->bd_disk->private_data;
+	size_t offset = sector << 9;
 
-	if (!pmem)
+	if (unlikely(!pmem))
 		return -ENODEV;
 
-	*kaddr = pmem_lookup_pg_addr(pmem, sector);
-	*pfn = pmem_lookup_pfn(pmem, sector);
+	*kaddr = pmem->virt_addr + offset;
+	*pfn = (pmem->phys_addr + offset) >> PAGE_SHIFT;
 
-	return pmem->size - (sector * 512);
+	return pmem->size - offset;
 }
 
 static const struct block_device_operations pmem_fops = {
 	.owner =		THIS_MODULE,
 	.rw_page =		pmem_rw_page,
 	.direct_access =	pmem_direct_access,
-	.getgeo =		pmem_getgeo,
 };
 
 /* pmem->phys_addr and pmem->size need to be set.
@@ -307,7 +226,7 @@  static int pmem_probe(struct platform_device *pdev)
 	disk->flags		= GENHD_FL_EXT_DEVT;
 	sprintf(disk->disk_name, "pmem%d", idx);
 	disk->driverfs_dev = &pdev->dev;
-	set_capacity(disk, pmem->size >> SECTOR_SHIFT);
+	set_capacity(disk, pmem->size >> 9);
 	pmem->pmem_disk = disk;
 
 	add_disk(disk);