diff mbox

[v5] DAX: enable iostat for read/write

Message ID 20170112183848.23159-1-toshi.kani@hpe.com (mailing list archive)
State New, archived
Headers show

Commit Message

Kani, Toshi Jan. 12, 2017, 6:38 p.m. UTC
DAX IO path does not support iostat, but its metadata IO path does.
Therefore, iostat shows metadata IO statistics only, which has been
confusing to users.

Add iostat support to the DAX read/write path.

Note, iostat still does not support the DAX mmap path as it allows
user applications to access directly.

Signed-off-by: Toshi Kani <toshi.kani@hpe.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Joe Perches <joe@perches.com>
---
v5:
 - Add a flag in case 'start' is 0 after 'jiffies' rolls over.
   (Dan Williams)
 - Fix a signed/unsigned conversion. (Joe Perches)
---
 fs/dax.c |   15 +++++++++++++++
 1 file changed, 15 insertions(+)

--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Joe Perches Jan. 12, 2017, 6:02 p.m. UTC | #1
On Thu, 2017-01-12 at 11:38 -0700, Toshi Kani wrote:
> DAX IO path does not support iostat, but its metadata IO path does.
> Therefore, iostat shows metadata IO statistics only, which has been
> confusing to users.
[]
> diff --git a/fs/dax.c b/fs/dax.c
[]
> @@ -1058,12 +1058,24 @@ dax_iomap_rw(struct kiocb *iocb, struct iov_iter *iter,
>  {
>  	struct address_space *mapping = iocb->ki_filp->f_mapping;
>  	struct inode *inode = mapping->host;
> +	struct gendisk *disk = inode->i_sb->s_bdev->bd_disk;
>  	loff_t pos = iocb->ki_pos, ret = 0, done = 0;
>  	unsigned flags = 0;
> +	unsigned long start = 0;
> +	int do_acct = blk_queue_io_stat(disk->queue);
>  
>  	if (iov_iter_rw(iter) == WRITE)
>  		flags |= IOMAP_WRITE;
>  
> +	if (do_acct) {
> +		sector_t sec = iov_iter_count(iter) >> 9;
> +
> +		start = jiffies;
> +		generic_start_io_acct(iov_iter_rw(iter),
> +				      min_t(unsigned long, 1, sec),

I believe I mislead you with a thinko.

Your original code was
	(!sec) ? 1 : sec
and I suggested incorrectly using min_t

It should of course be max_t.  Sorry.

Also, as sec is now sector_t (u64), perhaps this
unsigned long cast is incorrect.


--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Kani, Toshi Jan. 12, 2017, 6:26 p.m. UTC | #2
T24gVGh1LCAyMDE3LTAxLTEyIGF0IDEwOjAyIC0wODAwLCBKb2UgUGVyY2hlcyB3cm90ZToNCj4g
T24gVGh1LCAyMDE3LTAxLTEyIGF0IDExOjM4IC0wNzAwLCBUb3NoaSBLYW5pIHdyb3RlOg0KPiA+
IERBWCBJTyBwYXRoIGRvZXMgbm90IHN1cHBvcnQgaW9zdGF0LCBidXQgaXRzIG1ldGFkYXRhIElP
IHBhdGggZG9lcy4NCj4gPiBUaGVyZWZvcmUsIGlvc3RhdCBzaG93cyBtZXRhZGF0YSBJTyBzdGF0
aXN0aWNzIG9ubHksIHdoaWNoIGhhcyBiZWVuDQo+ID4gY29uZnVzaW5nIHRvIHVzZXJzLg0KPiAN
Cj4gW10NCj4gPiBkaWZmIC0tZ2l0IGEvZnMvZGF4LmMgYi9mcy9kYXguYw0KPiANCj4gW10NCj4g
PiBAQCAtMTA1OCwxMiArMTA1OCwyNCBAQCBkYXhfaW9tYXBfcncoc3RydWN0IGtpb2NiICppb2Ni
LCBzdHJ1Y3QNCj4gPiBpb3ZfaXRlciAqaXRlciwNCj4gPiDCoHsNCj4gPiDCoAlzdHJ1Y3QgYWRk
cmVzc19zcGFjZSAqbWFwcGluZyA9IGlvY2ItPmtpX2ZpbHAtPmZfbWFwcGluZzsNCj4gPiDCoAlz
dHJ1Y3QgaW5vZGUgKmlub2RlID0gbWFwcGluZy0+aG9zdDsNCj4gPiArCXN0cnVjdCBnZW5kaXNr
ICpkaXNrID0gaW5vZGUtPmlfc2ItPnNfYmRldi0+YmRfZGlzazsNCj4gPiDCoAlsb2ZmX3QgcG9z
ID0gaW9jYi0+a2lfcG9zLCByZXQgPSAwLCBkb25lID0gMDsNCj4gPiDCoAl1bnNpZ25lZCBmbGFn
cyA9IDA7DQo+ID4gKwl1bnNpZ25lZCBsb25nIHN0YXJ0ID0gMDsNCj4gPiArCWludCBkb19hY2N0
ID0gYmxrX3F1ZXVlX2lvX3N0YXQoZGlzay0+cXVldWUpOw0KPiA+IMKgDQo+ID4gwqAJaWYgKGlv
dl9pdGVyX3J3KGl0ZXIpID09IFdSSVRFKQ0KPiA+IMKgCQlmbGFncyB8PSBJT01BUF9XUklURTsN
Cj4gPiDCoA0KPiA+ICsJaWYgKGRvX2FjY3QpIHsNCj4gPiArCQlzZWN0b3JfdCBzZWMgPSBpb3Zf
aXRlcl9jb3VudChpdGVyKSA+PiA5Ow0KPiA+ICsNCj4gPiArCQlzdGFydCA9IGppZmZpZXM7DQo+
ID4gKwkJZ2VuZXJpY19zdGFydF9pb19hY2N0KGlvdl9pdGVyX3J3KGl0ZXIpLA0KPiA+ICsJCQkJ
wqDCoMKgwqDCoMKgbWluX3QodW5zaWduZWQgbG9uZywgMSwNCj4gPiBzZWMpLA0KPiANCj4gSSBi
ZWxpZXZlIEkgbWlzbGVhZCB5b3Ugd2l0aCBhIHRoaW5rby4NCj4gDQo+IFlvdXIgb3JpZ2luYWwg
Y29kZSB3YXMNCj4gCSghc2VjKSA/IDEgOiBzZWMNCj4gYW5kIEkgc3VnZ2VzdGVkIGluY29ycmVj
dGx5IHVzaW5nIG1pbl90DQo+IA0KPiBJdCBzaG91bGQgb2YgY291cnNlIGJlIG1heF90LsKgwqBT
b3JyeS4NCg0KTXkgYmFkLiBJIHNob3VsZCBoYXZlIGNhdWdodCBpdC4NCg0KPiBBbHNvLCBhcyBz
ZWMgaXMgbm93IHNlY3Rvcl90ICh1NjQpLCBwZXJoYXBzIHRoaXMNCj4gdW5zaWduZWQgbG9uZyBj
YXN0IGlzIGluY29ycmVjdC4NCg0KSSBzZWUuIFNpbmNlIGlvdl9pdGVyX2NvdW50KCkgcmV0dXJu
cyBhIHNpemVfdCB2YWx1ZSwgSSB3aWxsIHVzZQ0KJ3NpemVfdCcgZm9yICdzZWMnIGFzIHlvdSBv
cmlnaW5hbGx5IHN1Z2dlc3RlZC4gDQoNClRoYW5rcywNCi1Ub3NoaQ0K
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Christoph Hellwig Jan. 25, 2017, 11:19 a.m. UTC | #3
On Thu, Jan 12, 2017 at 11:38:48AM -0700, Toshi Kani wrote:
> DAX IO path does not support iostat, but its metadata IO path does.
> Therefore, iostat shows metadata IO statistics only, which has been
> confusing to users.
> 
> Add iostat support to the DAX read/write path.
> 
> Note, iostat still does not support the DAX mmap path as it allows
> user applications to access directly.

NAK.  DAX I/O should not be accounted for block device statistics.
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jeff Moyer Jan. 25, 2017, 3:15 p.m. UTC | #4
Christoph Hellwig <hch@infradead.org> writes:

> On Thu, Jan 12, 2017 at 11:38:48AM -0700, Toshi Kani wrote:
>> DAX IO path does not support iostat, but its metadata IO path does.
>> Therefore, iostat shows metadata IO statistics only, which has been
>> confusing to users.
>> 
>> Add iostat support to the DAX read/write path.
>> 
>> Note, iostat still does not support the DAX mmap path as it allows
>> user applications to access directly.
>
> NAK.  DAX I/O should not be accounted for block device statistics.

Agreed, this is a layering violation.

Cheers,
Jeff
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Kani, Toshi Jan. 25, 2017, 3:55 p.m. UTC | #5
T24gV2VkLCAyMDE3LTAxLTI1IGF0IDEwOjE1IC0wNTAwLCBKZWZmIE1veWVyIHdyb3RlOg0KPiBD
aHJpc3RvcGggSGVsbHdpZyA8aGNoQGluZnJhZGVhZC5vcmc+IHdyaXRlczoNCj4gDQo+ID4gT24g
VGh1LCBKYW4gMTIsIDIwMTcgYXQgMTE6Mzg6NDhBTSAtMDcwMCwgVG9zaGkgS2FuaSB3cm90ZToN
Cj4gPiA+IERBWCBJTyBwYXRoIGRvZXMgbm90IHN1cHBvcnQgaW9zdGF0LCBidXQgaXRzIG1ldGFk
YXRhIElPIHBhdGgNCj4gPiA+IGRvZXMuIFRoZXJlZm9yZSwgaW9zdGF0IHNob3dzIG1ldGFkYXRh
IElPIHN0YXRpc3RpY3Mgb25seSwgd2hpY2gNCj4gPiA+IGhhcyBiZWVuIGNvbmZ1c2luZyB0byB1
c2Vycy4NCj4gPiA+IA0KPiA+ID4gQWRkIGlvc3RhdCBzdXBwb3J0IHRvIHRoZSBEQVggcmVhZC93
cml0ZSBwYXRoLg0KPiA+ID4gDQo+ID4gPiBOb3RlLCBpb3N0YXQgc3RpbGwgZG9lcyBub3Qgc3Vw
cG9ydCB0aGUgREFYIG1tYXAgcGF0aCBhcyBpdA0KPiA+ID4gYWxsb3dzIHVzZXIgYXBwbGljYXRp
b25zIHRvIGFjY2VzcyBkaXJlY3RseS4NCj4gPiANCj4gPiBOQUsuwqDCoERBWCBJL08gc2hvdWxk
IG5vdCBiZSBhY2NvdW50ZWQgZm9yIGJsb2NrIGRldmljZSBzdGF0aXN0aWNzLg0KPiANCj4gQWdy
ZWVkLCB0aGlzIGlzIGEgbGF5ZXJpbmcgdmlvbGF0aW9uLg0KDQpJIHdpbGwgY2hlY2sgdG8gc2Vl
IGlmIGl0IGNhbiBmaXQgb24gdG9wIG9mIERhbidzIHBhdGNoLXNldC4NCg0KVGhhbmtzLA0KLVRv
c2hp
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/fs/dax.c b/fs/dax.c
index 5c74f60..a3e406a 100644
--- a/fs/dax.c
+++ b/fs/dax.c
@@ -1058,12 +1058,24 @@  dax_iomap_rw(struct kiocb *iocb, struct iov_iter *iter,
 {
 	struct address_space *mapping = iocb->ki_filp->f_mapping;
 	struct inode *inode = mapping->host;
+	struct gendisk *disk = inode->i_sb->s_bdev->bd_disk;
 	loff_t pos = iocb->ki_pos, ret = 0, done = 0;
 	unsigned flags = 0;
+	unsigned long start = 0;
+	int do_acct = blk_queue_io_stat(disk->queue);
 
 	if (iov_iter_rw(iter) == WRITE)
 		flags |= IOMAP_WRITE;
 
+	if (do_acct) {
+		sector_t sec = iov_iter_count(iter) >> 9;
+
+		start = jiffies;
+		generic_start_io_acct(iov_iter_rw(iter),
+				      min_t(unsigned long, 1, sec),
+				      &disk->part0);
+	}
+
 	while (iov_iter_count(iter)) {
 		ret = iomap_apply(inode, pos, iov_iter_count(iter), flags, ops,
 				iter, dax_iomap_actor);
@@ -1073,6 +1085,9 @@  dax_iomap_rw(struct kiocb *iocb, struct iov_iter *iter,
 		done += ret;
 	}
 
+	if (do_acct)
+		generic_end_io_acct(iov_iter_rw(iter), &disk->part0, start);
+
 	iocb->ki_pos += done;
 	return done ? done : ret;
 }