diff mbox

[v4] DAX: enable iostat for read/write

Message ID 20170111001122.10826-1-toshi.kani@hpe.com (mailing list archive)
State New, archived
Headers show

Commit Message

Kani, Toshi Jan. 11, 2017, 12:11 a.m. UTC
DAX IO path does not support iostat, but its metadata IO path does.
Therefore, iostat shows metadata IO statistics only, which has been
confusing to users.

Add iostat support to the DAX read/write path.

Note, iostat still does not support the DAX mmap path as it allows
user applications to access directly.

Signed-off-by: Toshi Kani <toshi.kani@hpe.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
---
v4: Rebased to 4.10, applied the v3 change to new dax_iomap_rw().
---
 fs/dax.c |   13 +++++++++++++
 1 file changed, 13 insertions(+)

--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Dan Williams Jan. 10, 2017, 11:41 p.m. UTC | #1
On Tue, Jan 10, 2017 at 4:11 PM, Toshi Kani <toshi.kani@hpe.com> wrote:
> DAX IO path does not support iostat, but its metadata IO path does.
> Therefore, iostat shows metadata IO statistics only, which has been
> confusing to users.
>
> Add iostat support to the DAX read/write path.
>
> Note, iostat still does not support the DAX mmap path as it allows
> user applications to access directly.
>
> Signed-off-by: Toshi Kani <toshi.kani@hpe.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Dan Williams <dan.j.williams@intel.com>
> Cc: Alexander Viro <viro@zeniv.linux.org.uk>
> Cc: Dave Chinner <david@fromorbit.com>
> Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
> ---
> v4: Rebased to 4.10, applied the v3 change to new dax_iomap_rw().
> ---
>  fs/dax.c |   13 +++++++++++++
>  1 file changed, 13 insertions(+)
>
> diff --git a/fs/dax.c b/fs/dax.c
> index 5c74f60..4d5f4c0 100644
> --- a/fs/dax.c
> +++ b/fs/dax.c
> @@ -1058,12 +1058,22 @@ dax_iomap_rw(struct kiocb *iocb, struct iov_iter *iter,
>  {
>         struct address_space *mapping = iocb->ki_filp->f_mapping;
>         struct inode *inode = mapping->host;
> +       struct gendisk *disk = inode->i_sb->s_bdev->bd_disk;
>         loff_t pos = iocb->ki_pos, ret = 0, done = 0;
>         unsigned flags = 0;
> +       unsigned long start = 0;
>
>         if (iov_iter_rw(iter) == WRITE)
>                 flags |= IOMAP_WRITE;
>
> +       if (blk_queue_io_stat(disk->queue)) {
> +               int sec = iov_iter_count(iter) >> 9;
> +
> +               start = jiffies;
> +               generic_start_io_acct(iov_iter_rw(iter),
> +                                     (!sec) ? 1 : sec, &disk->part0);
> +       }
> +
>         while (iov_iter_count(iter)) {
>                 ret = iomap_apply(inode, pos, iov_iter_count(iter), flags, ops,
>                                 iter, dax_iomap_actor);
> @@ -1073,6 +1083,9 @@ dax_iomap_rw(struct kiocb *iocb, struct iov_iter *iter,
>                 done += ret;
>         }
>
> +       if (start)
> +               generic_end_io_acct(iov_iter_rw(iter), &disk->part0, start);
> +

I think we can afford to add a separate flag that indicates whether we
called generic_start_io_acct(). Just in case 'start' is '0' after
'jiffies' rolls over.
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Kani, Toshi Jan. 10, 2017, 11:54 p.m. UTC | #2
On Tue, 2017-01-10 at 15:41 -0800, Dan Williams wrote:
> On Tue, Jan 10, 2017 at 4:11 PM, Toshi Kani <toshi.kani@hpe.com>

> wrote:

 :
> > 

> > +       if (blk_queue_io_stat(disk->queue)) {

> > +               int sec = iov_iter_count(iter) >> 9;

> > +

> > +               start = jiffies;

> > +               generic_start_io_acct(iov_iter_rw(iter),

> > +                                     (!sec) ? 1 : sec, &disk-

> > >part0);

> > +       }

> > +

> >         while (iov_iter_count(iter)) {

> >                 ret = iomap_apply(inode, pos, iov_iter_count(iter),

> > flags, ops,

> >                                 iter, dax_iomap_actor);

> > @@ -1073,6 +1083,9 @@ dax_iomap_rw(struct kiocb *iocb, struct

> > iov_iter *iter,

> >                 done += ret;

> >         }

> > 

> > +       if (start)

> > +               generic_end_io_acct(iov_iter_rw(iter), &disk-

> > >part0, start);

> > +

> 

> I think we can afford to add a separate flag that indicates whether

> we called generic_start_io_acct(). Just in case 'start' is '0' after

> 'jiffies' rolls over.


Good point. I will add a flag to account such a case.

Thanks,
-Toshi
Joe Perches Jan. 11, 2017, 4:36 a.m. UTC | #3
On Tue, 2017-01-10 at 17:11 -0700, Toshi Kani wrote:
> DAX IO path does not support iostat, but its metadata IO path does.
> Therefore, iostat shows metadata IO statistics only, which has been
> confusing to users.
[]
> diff --git a/fs/dax.c b/fs/dax.c
[]
> @@ -1058,12 +1058,22 @@ dax_iomap_rw(struct kiocb *iocb, struct iov_iter *iter,
[]
> +	if (blk_queue_io_stat(disk->queue)) {
> +		int sec = iov_iter_count(iter) >> 9;
> +
> +		start = jiffies;
> +		generic_start_io_acct(iov_iter_rw(iter),
> +				      (!sec) ? 1 : sec, &disk->part0);
> +	}

There is a signed/unsigned conversion of sec
It may be better to use something like:

		size_t sec  = iov_iter_count(iter) >> 9;
		[...]
		generic_start_io_acct(iov_iter_rw(iter),
				      min_t(unsigned long, 1, sec),
				      &disk->part0);
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Kani, Toshi Jan. 11, 2017, 4:29 p.m. UTC | #4
T24gVHVlLCAyMDE3LTAxLTEwIGF0IDIwOjM2IC0wODAwLCBKb2UgUGVyY2hlcyB3cm90ZToNCj4g
PiANCk9uIFR1ZSwgMjAxNy0wMS0xMCBhdCAxNzoxMSAtMDcwMCwgVG9zaGkgS2FuaSB3cm90ZToN
Cj4gPiBEQVggSU8gcGF0aCBkb2VzIG5vdCBzdXBwb3J0IGlvc3RhdCwgYnV0IGl0cyBtZXRhZGF0
YSBJTyBwYXRoIGRvZXMuDQo+ID4gVGhlcmVmb3JlLCBpb3N0YXQgc2hvd3MgbWV0YWRhdGEgSU8g
c3RhdGlzdGljcyBvbmx5LCB3aGljaCBoYXMgYmVlbg0KPiA+IGNvbmZ1c2luZyB0byB1c2Vycy4N
Cj4gDQo+IFtdDQo+ID4gZGlmZiAtLWdpdCBhL2ZzL2RheC5jIGIvZnMvZGF4LmMNCj4gDQo+IFtd
DQo+ID4gQEAgLTEwNTgsMTIgKzEwNTgsMjIgQEAgZGF4X2lvbWFwX3J3KHN0cnVjdCBraW9jYiAq
aW9jYiwgc3RydWN0DQo+ID4gaW92X2l0ZXIgKml0ZXIsDQo+IA0KPiBbXQ0KPiA+ICsJaWYgKGJs
a19xdWV1ZV9pb19zdGF0KGRpc2stPnF1ZXVlKSkgew0KPiA+ICsJCWludCBzZWMgPSBpb3ZfaXRl
cl9jb3VudChpdGVyKSA+PiA5Ow0KPiA+ICsNCj4gPiArCQlzdGFydCA9IGppZmZpZXM7DQo+ID4g
KwkJZ2VuZXJpY19zdGFydF9pb19hY2N0KGlvdl9pdGVyX3J3KGl0ZXIpLA0KPiA+ICsJCQkJwqDC
oMKgwqDCoMKgKCFzZWMpID8gMSA6IHNlYywgJmRpc2stDQo+ID4gPnBhcnQwKTsNCj4gPiArCX0N
Cj4gDQo+IFRoZXJlIGlzIGEgc2lnbmVkL3Vuc2lnbmVkIGNvbnZlcnNpb24gb2Ygc2VjDQo+IEl0
IG1heSBiZSBiZXR0ZXIgdG8gdXNlIHNvbWV0aGluZyBsaWtlOg0KPiANCj4gCQlzaXplX3Qgc2Vj
wqDCoD0gaW92X2l0ZXJfY291bnQoaXRlcikgPj4gOTsNCj4gCQlbLi4uXQ0KPiAJCWdlbmVyaWNf
c3RhcnRfaW9fYWNjdChpb3ZfaXRlcl9ydyhpdGVyKSwNCj4gCQkJCcKgwqDCoMKgwqDCoG1pbl90
KHVuc2lnbmVkIGxvbmcsIDEsIHNlYyksDQo+IAkJCQnCoMKgwqDCoMKgwqAmZGlzay0+cGFydDAp
Ow0KDQpHb29kIGNhdGNoLiBJIHdpbGwgY2hhbmdlIGFzIHlvdSBzdWdnZXN0ZWQsIGFuZCB1c2Ug
J3NlY3Rvcl90JyBmb3INCidzZWMnLiANCg0KVGhhbmtzLA0KLVRvc2hp
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/fs/dax.c b/fs/dax.c
index 5c74f60..4d5f4c0 100644
--- a/fs/dax.c
+++ b/fs/dax.c
@@ -1058,12 +1058,22 @@  dax_iomap_rw(struct kiocb *iocb, struct iov_iter *iter,
 {
 	struct address_space *mapping = iocb->ki_filp->f_mapping;
 	struct inode *inode = mapping->host;
+	struct gendisk *disk = inode->i_sb->s_bdev->bd_disk;
 	loff_t pos = iocb->ki_pos, ret = 0, done = 0;
 	unsigned flags = 0;
+	unsigned long start = 0;
 
 	if (iov_iter_rw(iter) == WRITE)
 		flags |= IOMAP_WRITE;
 
+	if (blk_queue_io_stat(disk->queue)) {
+		int sec = iov_iter_count(iter) >> 9;
+
+		start = jiffies;
+		generic_start_io_acct(iov_iter_rw(iter),
+				      (!sec) ? 1 : sec, &disk->part0);
+	}
+
 	while (iov_iter_count(iter)) {
 		ret = iomap_apply(inode, pos, iov_iter_count(iter), flags, ops,
 				iter, dax_iomap_actor);
@@ -1073,6 +1083,9 @@  dax_iomap_rw(struct kiocb *iocb, struct iov_iter *iter,
 		done += ret;
 	}
 
+	if (start)
+		generic_end_io_acct(iov_iter_rw(iter), &disk->part0, start);
+
 	iocb->ki_pos += done;
 	return done ? done : ret;
 }