From patchwork Tue Jan 28 13:02:39 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Xiubo Li X-Patchwork-Id: 11354245 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 5403A138C for ; Tue, 28 Jan 2020 13:04:40 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 1E60624683 for ; Tue, 28 Jan 2020 13:04:40 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="F/YyiDrn" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726190AbgA1NEj (ORCPT ); Tue, 28 Jan 2020 08:04:39 -0500 Received: from us-smtp-1.mimecast.com ([205.139.110.61]:28044 "EHLO us-smtp-delivery-1.mimecast.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1726034AbgA1NEj (ORCPT ); Tue, 28 Jan 2020 08:04:39 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1580216676; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=3DpaiHybq/mDJdOJRa6ck4kmfMjNKlKdJFR2TYXL9Is=; b=F/YyiDrnYFhkEFsotIlgxEhNlq7oz99g0yp3ok02xnT1qGV0n0DYauYfvymUxzPJvtSvej pI58tdgmXtogZU58RCi/FyG8jrvTKlRKorPnTuB+AW7VtkRMmPLyvUcD7vvupeeSAz3xzF v2vkqw5QCWk+mEnRVauPTgg2jooa2Lg= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-158-OPw75d5DOMSJqcdlzaEHjw-1; Tue, 28 Jan 2020 08:03:04 -0500 X-MC-Unique: OPw75d5DOMSJqcdlzaEHjw-1 Received: from smtp.corp.redhat.com (int-mx01.intmail.prod.int.phx2.redhat.com [10.5.11.11]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id A20E213E4; Tue, 28 Jan 2020 13:03:03 +0000 (UTC) Received: from localhost.localdomain (ovpn-12-34.pek2.redhat.com [10.72.12.34]) by smtp.corp.redhat.com (Postfix) with ESMTP id DD6AB80A5C; Tue, 28 Jan 2020 13:03:00 +0000 (UTC) From: xiubli@redhat.com To: jlayton@kernel.org, idryomov@gmail.com, zyan@redhat.com Cc: sage@redhat.com, pdonnell@redhat.com, ceph-devel@vger.kernel.org, Xiubo Li Subject: [PATCH v5 01/10] ceph: add caps perf metric for each session Date: Tue, 28 Jan 2020 08:02:39 -0500 Message-Id: <20200128130248.4266-2-xiubli@redhat.com> In-Reply-To: <20200128130248.4266-1-xiubli@redhat.com> References: <20200128130248.4266-1-xiubli@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.11 Sender: ceph-devel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org From: Xiubo Li This will fulfill the caps hit/miss metric for each session. When checking the "need" mask and if one cap has the subset of the "need" mask it means hit, or missed. item total miss hit ------------------------------------------------- d_lease 295 0 993 session caps miss hit ------------------------------------------------- 0 295 107 4119 1 1 107 9 URL: https://tracker.ceph.com/issues/43215 Signed-off-by: Xiubo Li --- fs/ceph/acl.c | 2 ++ fs/ceph/addr.c | 2 ++ fs/ceph/caps.c | 74 ++++++++++++++++++++++++++++++++++++++++++++ fs/ceph/debugfs.c | 20 ++++++++++++ fs/ceph/dir.c | 9 ++++-- fs/ceph/file.c | 3 ++ fs/ceph/mds_client.c | 16 +++++++++- fs/ceph/mds_client.h | 3 ++ fs/ceph/quota.c | 9 ++++-- fs/ceph/super.h | 11 +++++++ fs/ceph/xattr.c | 17 ++++++++-- 11 files changed, 158 insertions(+), 8 deletions(-) diff --git a/fs/ceph/acl.c b/fs/ceph/acl.c index 26be6520d3fb..58e119e3519f 100644 --- a/fs/ceph/acl.c +++ b/fs/ceph/acl.c @@ -22,6 +22,8 @@ static inline void ceph_set_cached_acl(struct inode *inode, struct ceph_inode_info *ci = ceph_inode(inode); spin_lock(&ci->i_ceph_lock); + __ceph_caps_metric(ci, CEPH_CAP_XATTR_SHARED); + if (__ceph_caps_issued_mask(ci, CEPH_CAP_XATTR_SHARED, 0)) set_cached_acl(inode, type, acl); else diff --git a/fs/ceph/addr.c b/fs/ceph/addr.c index 7ab616601141..29d4513eff8c 100644 --- a/fs/ceph/addr.c +++ b/fs/ceph/addr.c @@ -1706,6 +1706,8 @@ int ceph_uninline_data(struct file *filp, struct page *locked_page) err = -ENOMEM; goto out; } + + ceph_caps_metric(ci, CEPH_STAT_CAP_INLINE_DATA); err = __ceph_do_getattr(inode, page, CEPH_STAT_CAP_INLINE_DATA, true); if (err < 0) { diff --git a/fs/ceph/caps.c b/fs/ceph/caps.c index 7fc87b693ba4..af2e9e826f8c 100644 --- a/fs/ceph/caps.c +++ b/fs/ceph/caps.c @@ -783,6 +783,75 @@ static int __cap_is_valid(struct ceph_cap *cap) return 1; } +/* + * Counts the cap metric. + */ +void __ceph_caps_metric(struct ceph_inode_info *ci, int mask) +{ + int have = ci->i_snap_caps; + struct ceph_mds_session *s; + struct ceph_cap *cap; + struct rb_node *p; + bool skip_auth = false; + + lockdep_assert_held(&ci->i_ceph_lock); + + if (mask <= 0) + return; + + /* Counts the snap caps metric in the auth cap */ + if (ci->i_auth_cap) { + cap = ci->i_auth_cap; + if (have) { + have |= cap->issued; + + dout("%s %p cap %p issued %s, mask %s\n", __func__, + &ci->vfs_inode, cap, ceph_cap_string(cap->issued), + ceph_cap_string(mask)); + + s = ceph_get_mds_session(cap->session); + if (s) { + if (mask & have) + percpu_counter_inc(&s->i_caps_hit); + else + percpu_counter_inc(&s->i_caps_mis); + ceph_put_mds_session(s); + } + skip_auth = true; + } + } + + if ((mask & have) == mask) + return; + + /* Checks others */ + for (p = rb_first(&ci->i_caps); p; p = rb_next(p)) { + cap = rb_entry(p, struct ceph_cap, ci_node); + if (!__cap_is_valid(cap)) + continue; + + if (skip_auth && cap == ci->i_auth_cap) + continue; + + dout("%s %p cap %p issued %s, mask %s\n", __func__, + &ci->vfs_inode, cap, ceph_cap_string(cap->issued), + ceph_cap_string(mask)); + + s = ceph_get_mds_session(cap->session); + if (s) { + if (mask & cap->issued) + percpu_counter_inc(&s->i_caps_hit); + else + percpu_counter_inc(&s->i_caps_mis); + ceph_put_mds_session(s); + } + + have |= cap->issued; + if ((mask & have) == mask) + return; + } +} + /* * Return set of valid cap bits issued to us. Note that caps time * out, and may be invalidated in bulk if the client session times out @@ -2746,6 +2815,7 @@ static void check_max_size(struct inode *inode, loff_t endoff) int ceph_try_get_caps(struct inode *inode, int need, int want, bool nonblock, int *got) { + struct ceph_inode_info *ci = ceph_inode(inode); int ret; BUG_ON(need & ~CEPH_CAP_FILE_RD); @@ -2758,6 +2828,7 @@ int ceph_try_get_caps(struct inode *inode, int need, int want, BUG_ON(want & ~(CEPH_CAP_FILE_CACHE | CEPH_CAP_FILE_LAZYIO | CEPH_CAP_FILE_SHARED | CEPH_CAP_FILE_EXCL | CEPH_CAP_ANY_DIR_OPS)); + ceph_caps_metric(ci, need | want); ret = try_get_cap_refs(inode, need, want, 0, nonblock, got); return ret == -EAGAIN ? 0 : ret; } @@ -2784,6 +2855,8 @@ int ceph_get_caps(struct file *filp, int need, int want, fi->filp_gen != READ_ONCE(fsc->filp_gen)) return -EBADF; + ceph_caps_metric(ci, need | want); + while (true) { if (endoff > 0) check_max_size(inode, endoff); @@ -2871,6 +2944,7 @@ int ceph_get_caps(struct file *filp, int need, int want, * getattr request will bring inline data into * page cache */ + ceph_caps_metric(ci, CEPH_STAT_CAP_INLINE_DATA); ret = __ceph_do_getattr(inode, NULL, CEPH_STAT_CAP_INLINE_DATA, true); diff --git a/fs/ceph/debugfs.c b/fs/ceph/debugfs.c index 40a22da0214a..c132fdb40d53 100644 --- a/fs/ceph/debugfs.c +++ b/fs/ceph/debugfs.c @@ -128,6 +128,7 @@ static int metric_show(struct seq_file *s, void *p) { struct ceph_fs_client *fsc = s->private; struct ceph_mds_client *mdsc = fsc->mdsc; + int i; seq_printf(s, "item total miss hit\n"); seq_printf(s, "-------------------------------------------------\n"); @@ -137,6 +138,25 @@ static int metric_show(struct seq_file *s, void *p) percpu_counter_sum(&mdsc->metric.d_lease_mis), percpu_counter_sum(&mdsc->metric.d_lease_hit)); + seq_printf(s, "\n"); + seq_printf(s, "session caps miss hit\n"); + seq_printf(s, "-------------------------------------------------\n"); + + mutex_lock(&mdsc->mutex); + for (i = 0; i < mdsc->max_sessions; i++) { + struct ceph_mds_session *session; + + session = __ceph_lookup_mds_session(mdsc, i); + if (!session) + continue; + seq_printf(s, "%-14d%-16d%-16lld%lld\n", i, + session->s_nr_caps, + percpu_counter_sum(&session->i_caps_mis), + percpu_counter_sum(&session->i_caps_hit)); + ceph_put_mds_session(session); + } + mutex_unlock(&mdsc->mutex); + return 0; } diff --git a/fs/ceph/dir.c b/fs/ceph/dir.c index 658c55b323cc..33eb239e09e2 100644 --- a/fs/ceph/dir.c +++ b/fs/ceph/dir.c @@ -313,7 +313,7 @@ static int ceph_readdir(struct file *file, struct dir_context *ctx) struct ceph_fs_client *fsc = ceph_inode_to_client(inode); struct ceph_mds_client *mdsc = fsc->mdsc; int i; - int err; + int err, ret = -1; unsigned frag = -1; struct ceph_mds_reply_info_parsed *rinfo; @@ -346,13 +346,16 @@ static int ceph_readdir(struct file *file, struct dir_context *ctx) !ceph_test_mount_opt(fsc, NOASYNCREADDIR) && ceph_snap(inode) != CEPH_SNAPDIR && __ceph_dir_is_complete_ordered(ci) && - __ceph_caps_issued_mask(ci, CEPH_CAP_FILE_SHARED, 1)) { + (ret = __ceph_caps_issued_mask(ci, CEPH_CAP_FILE_SHARED, 1))) { int shared_gen = atomic_read(&ci->i_shared_gen); + __ceph_caps_metric(ci, CEPH_CAP_XATTR_SHARED); spin_unlock(&ci->i_ceph_lock); err = __dcache_readdir(file, ctx, shared_gen); if (err != -EAGAIN) return err; } else { + if (ret != -1) + __ceph_caps_metric(ci, CEPH_CAP_XATTR_SHARED); spin_unlock(&ci->i_ceph_lock); } @@ -757,6 +760,8 @@ static struct dentry *ceph_lookup(struct inode *dir, struct dentry *dentry, struct ceph_dentry_info *di = ceph_dentry(dentry); spin_lock(&ci->i_ceph_lock); + __ceph_caps_metric(ci, CEPH_CAP_FILE_SHARED); + dout(" dir %p flags are %d\n", dir, ci->i_ceph_flags); if (strncmp(dentry->d_name.name, fsc->mount_options->snapdir_name, diff --git a/fs/ceph/file.c b/fs/ceph/file.c index 1e6cdf2dfe90..c78dfbbb7b91 100644 --- a/fs/ceph/file.c +++ b/fs/ceph/file.c @@ -384,6 +384,8 @@ int ceph_open(struct inode *inode, struct file *file) * asynchronously. */ spin_lock(&ci->i_ceph_lock); + __ceph_caps_metric(ci, wanted); + if (__ceph_is_any_real_caps(ci) && (((fmode & CEPH_FILE_MODE_WR) == 0) || ci->i_auth_cap)) { int mds_wanted = __ceph_caps_mds_wanted(ci, true); @@ -1340,6 +1342,7 @@ static ssize_t ceph_read_iter(struct kiocb *iocb, struct iov_iter *to) return -ENOMEM; } + ceph_caps_metric(ci, CEPH_STAT_CAP_INLINE_DATA); statret = __ceph_do_getattr(inode, page, CEPH_STAT_CAP_INLINE_DATA, !!page); if (statret < 0) { diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c index a24fd00676b8..141c1c03636c 100644 --- a/fs/ceph/mds_client.c +++ b/fs/ceph/mds_client.c @@ -558,6 +558,8 @@ void ceph_put_mds_session(struct ceph_mds_session *s) if (refcount_dec_and_test(&s->s_ref)) { if (s->s_auth.authorizer) ceph_auth_destroy_authorizer(s->s_auth.authorizer); + percpu_counter_destroy(&s->i_caps_hit); + percpu_counter_destroy(&s->i_caps_mis); kfree(s); } } @@ -598,6 +600,7 @@ static struct ceph_mds_session *register_session(struct ceph_mds_client *mdsc, int mds) { struct ceph_mds_session *s; + int err; if (mds >= mdsc->mdsmap->possible_max_rank) return ERR_PTR(-EINVAL); @@ -612,8 +615,10 @@ static struct ceph_mds_session *register_session(struct ceph_mds_client *mdsc, dout("%s: realloc to %d\n", __func__, newmax); sa = kcalloc(newmax, sizeof(void *), GFP_NOFS); - if (!sa) + if (!sa) { + err = -ENOMEM; goto fail_realloc; + } if (mdsc->sessions) { memcpy(sa, mdsc->sessions, mdsc->max_sessions * sizeof(void *)); @@ -653,6 +658,13 @@ static struct ceph_mds_session *register_session(struct ceph_mds_client *mdsc, INIT_LIST_HEAD(&s->s_cap_flushing); + err = percpu_counter_init(&s->i_caps_hit, 0, GFP_NOFS); + if (err) + goto fail_realloc; + err = percpu_counter_init(&s->i_caps_mis, 0, GFP_NOFS); + if (err) + goto fail_init; + mdsc->sessions[mds] = s; atomic_inc(&mdsc->num_sessions); refcount_inc(&s->s_ref); /* one ref to sessions[], one to caller */ @@ -662,6 +674,8 @@ static struct ceph_mds_session *register_session(struct ceph_mds_client *mdsc, return s; +fail_init: + percpu_counter_destroy(&s->i_caps_hit); fail_realloc: kfree(s); return ERR_PTR(-ENOMEM); diff --git a/fs/ceph/mds_client.h b/fs/ceph/mds_client.h index dd1f417b90eb..ba74ff74c59c 100644 --- a/fs/ceph/mds_client.h +++ b/fs/ceph/mds_client.h @@ -201,6 +201,9 @@ struct ceph_mds_session { struct list_head s_waiting; /* waiting requests */ struct list_head s_unsafe; /* unsafe requests */ + + struct percpu_counter i_caps_hit; + struct percpu_counter i_caps_mis; }; /* diff --git a/fs/ceph/quota.c b/fs/ceph/quota.c index de56dee60540..4ce2f658e63d 100644 --- a/fs/ceph/quota.c +++ b/fs/ceph/quota.c @@ -147,9 +147,14 @@ static struct inode *lookup_quotarealm_inode(struct ceph_mds_client *mdsc, return NULL; } if (qri->inode) { + struct ceph_inode_info *ci = ceph_inode(qri->inode); + int ret; + + ceph_caps_metric(ci, CEPH_STAT_CAP_INODE); + /* get caps */ - int ret = __ceph_do_getattr(qri->inode, NULL, - CEPH_STAT_CAP_INODE, true); + ret = __ceph_do_getattr(qri->inode, NULL, + CEPH_STAT_CAP_INODE, true); if (ret >= 0) in = qri->inode; else diff --git a/fs/ceph/super.h b/fs/ceph/super.h index 7af91628636c..3f4829222528 100644 --- a/fs/ceph/super.h +++ b/fs/ceph/super.h @@ -641,6 +641,14 @@ static inline bool __ceph_is_any_real_caps(struct ceph_inode_info *ci) return !RB_EMPTY_ROOT(&ci->i_caps); } +extern void __ceph_caps_metric(struct ceph_inode_info *ci, int mask); +static inline void ceph_caps_metric(struct ceph_inode_info *ci, int mask) +{ + spin_lock(&ci->i_ceph_lock); + __ceph_caps_metric(ci, mask); + spin_unlock(&ci->i_ceph_lock); +} + extern int __ceph_caps_issued(struct ceph_inode_info *ci, int *implemented); extern int __ceph_caps_issued_mask(struct ceph_inode_info *ci, int mask, int t); extern int __ceph_caps_issued_other(struct ceph_inode_info *ci, @@ -927,6 +935,9 @@ extern int __ceph_do_getattr(struct inode *inode, struct page *locked_page, int mask, bool force); static inline int ceph_do_getattr(struct inode *inode, int mask, bool force) { + struct ceph_inode_info *ci = ceph_inode(inode); + + ceph_caps_metric(ci, mask); return __ceph_do_getattr(inode, NULL, mask, force); } extern int ceph_permission(struct inode *inode, int mask); diff --git a/fs/ceph/xattr.c b/fs/ceph/xattr.c index d58fa14c1f01..ebd522edb0a8 100644 --- a/fs/ceph/xattr.c +++ b/fs/ceph/xattr.c @@ -829,6 +829,7 @@ ssize_t __ceph_getxattr(struct inode *inode, const char *name, void *value, struct ceph_vxattr *vxattr = NULL; int req_mask; ssize_t err; + int ret = -1; /* let's see if a virtual xattr was requested */ vxattr = ceph_match_vxattr(inode, name); @@ -856,7 +857,9 @@ ssize_t __ceph_getxattr(struct inode *inode, const char *name, void *value, if (ci->i_xattrs.version == 0 || !((req_mask & CEPH_CAP_XATTR_SHARED) || - __ceph_caps_issued_mask(ci, CEPH_CAP_XATTR_SHARED, 1))) { + (ret = __ceph_caps_issued_mask(ci, CEPH_CAP_XATTR_SHARED, 1)))) { + if (ret != -1) + __ceph_caps_metric(ci, CEPH_CAP_XATTR_SHARED); spin_unlock(&ci->i_ceph_lock); /* security module gets xattr while filling trace */ @@ -871,6 +874,9 @@ ssize_t __ceph_getxattr(struct inode *inode, const char *name, void *value, if (err) return err; spin_lock(&ci->i_ceph_lock); + } else { + if (ret != -1) + __ceph_caps_metric(ci, CEPH_CAP_XATTR_SHARED); } err = __build_xattrs(inode); @@ -907,19 +913,24 @@ ssize_t ceph_listxattr(struct dentry *dentry, char *names, size_t size) struct ceph_inode_info *ci = ceph_inode(inode); bool len_only = (size == 0); u32 namelen; - int err; + int err, ret = -1; spin_lock(&ci->i_ceph_lock); dout("listxattr %p ver=%lld index_ver=%lld\n", inode, ci->i_xattrs.version, ci->i_xattrs.index_version); if (ci->i_xattrs.version == 0 || - !__ceph_caps_issued_mask(ci, CEPH_CAP_XATTR_SHARED, 1)) { + !(ret = __ceph_caps_issued_mask(ci, CEPH_CAP_XATTR_SHARED, 1))) { + if (ret != -1) + __ceph_caps_metric(ci, CEPH_CAP_XATTR_SHARED); spin_unlock(&ci->i_ceph_lock); err = ceph_do_getattr(inode, CEPH_STAT_CAP_XATTR, true); if (err) return err; spin_lock(&ci->i_ceph_lock); + } else { + if (ret != -1) + __ceph_caps_metric(ci, CEPH_CAP_XATTR_SHARED); } err = __build_xattrs(inode); From patchwork Tue Jan 28 13:02:40 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Xiubo Li X-Patchwork-Id: 11354225 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 33E68138C for ; Tue, 28 Jan 2020 13:03:16 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 0145324683 for ; Tue, 28 Jan 2020 13:03:16 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="TIO3KPd9" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726072AbgA1NDO (ORCPT ); Tue, 28 Jan 2020 08:03:14 -0500 Received: from us-smtp-delivery-1.mimecast.com ([207.211.31.120]:39784 "EHLO us-smtp-1.mimecast.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1725926AbgA1NDO (ORCPT ); Tue, 28 Jan 2020 08:03:14 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1580216592; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=6oZub7dYtpkVuiJqQ/eysBjiqkABCC/FaPm9IiQk+R0=; b=TIO3KPd9RLec2t/YMZMEHvxAA6/52JyiEkcatS/ljZ6r1xX/OF9IfFfIZhb8ZY44D8FNn3 aZgnLk2GSwmPNK3SnDOUxhAZM9g8J6jfQCEI7+PIpOtqUx/lmSBFvUc8GhWHBy1xW5Fnyn kgcHNvAhtnO6LIzEw9CTawI6cr3/BtM= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-147--ldhFdJtPK6VZyMKcGGC8A-1; Tue, 28 Jan 2020 08:03:08 -0500 X-MC-Unique: -ldhFdJtPK6VZyMKcGGC8A-1 Received: from smtp.corp.redhat.com (int-mx01.intmail.prod.int.phx2.redhat.com [10.5.11.11]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id ED42610120A1; Tue, 28 Jan 2020 13:03:06 +0000 (UTC) Received: from localhost.localdomain (ovpn-12-34.pek2.redhat.com [10.72.12.34]) by smtp.corp.redhat.com (Postfix) with ESMTP id 39A7F88827; Tue, 28 Jan 2020 13:03:03 +0000 (UTC) From: xiubli@redhat.com To: jlayton@kernel.org, idryomov@gmail.com, zyan@redhat.com Cc: sage@redhat.com, pdonnell@redhat.com, ceph-devel@vger.kernel.org, Xiubo Li Subject: [PATCH v5 02/10] ceph: move ceph_osdc_{read,write}pages to ceph.ko Date: Tue, 28 Jan 2020 08:02:40 -0500 Message-Id: <20200128130248.4266-3-xiubli@redhat.com> In-Reply-To: <20200128130248.4266-1-xiubli@redhat.com> References: <20200128130248.4266-1-xiubli@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.11 Sender: ceph-devel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org From: Xiubo Li Since this helpers are only used by cpeh.ko, let's move it to ceph.ko and rename to _sync_. Signed-off-by: Xiubo Li --- fs/ceph/addr.c | 86 ++++++++++++++++++++++++++++++++- include/linux/ceph/osd_client.h | 17 ------- net/ceph/osd_client.c | 79 ------------------------------ 3 files changed, 84 insertions(+), 98 deletions(-) diff --git a/fs/ceph/addr.c b/fs/ceph/addr.c index 29d4513eff8c..20e5ebfff389 100644 --- a/fs/ceph/addr.c +++ b/fs/ceph/addr.c @@ -182,6 +182,47 @@ static int ceph_releasepage(struct page *page, gfp_t g) return !PagePrivate(page); } +/* + * Read some contiguous pages. If we cross a stripe boundary, shorten + * *plen. Return number of bytes read, or error. + */ +static int ceph_sync_readpages(struct ceph_fs_client *fsc, + struct ceph_vino vino, + struct ceph_file_layout *layout, + u64 off, u64 *plen, + u32 truncate_seq, u64 truncate_size, + struct page **pages, int num_pages, + int page_align) +{ + struct ceph_osd_client *osdc = &fsc->client->osdc; + struct ceph_osd_request *req; + int rc = 0; + + dout("readpages on ino %llx.%llx on %llu~%llu\n", vino.ino, + vino.snap, off, *plen); + req = ceph_osdc_new_request(osdc, layout, vino, off, plen, 0, 1, + CEPH_OSD_OP_READ, CEPH_OSD_FLAG_READ, + NULL, truncate_seq, truncate_size, + false); + if (IS_ERR(req)) + return PTR_ERR(req); + + /* it may be a short read due to an object boundary */ + osd_req_op_extent_osd_data_pages(req, 0, + pages, *plen, page_align, false, false); + + dout("readpages final extent is %llu~%llu (%llu bytes align %d)\n", + off, *plen, *plen, page_align); + + rc = ceph_osdc_start_request(osdc, req, false); + if (!rc) + rc = ceph_osdc_wait_request(osdc, req); + + ceph_osdc_put_request(req); + dout("readpages result %d\n", rc); + return rc; +} + /* * read a single page, without unlocking it. */ @@ -218,7 +259,7 @@ static int ceph_do_readpage(struct file *filp, struct page *page) dout("readpage inode %p file %p page %p index %lu\n", inode, filp, page, page->index); - err = ceph_osdc_readpages(&fsc->client->osdc, ceph_vino(inode), + err = ceph_sync_readpages(fsc, ceph_vino(inode), &ci->i_layout, off, &len, ci->i_truncate_seq, ci->i_truncate_size, &page, 1, 0); @@ -570,6 +611,47 @@ static u64 get_writepages_data_length(struct inode *inode, return end > start ? end - start : 0; } +/* + * do a synchronous write on N pages + */ +static int ceph_sync_writepages(struct ceph_fs_client *fsc, + struct ceph_vino vino, + struct ceph_file_layout *layout, + struct ceph_snap_context *snapc, + u64 off, u64 len, + u32 truncate_seq, u64 truncate_size, + struct timespec64 *mtime, + struct page **pages, int num_pages) +{ + struct ceph_osd_client *osdc = &fsc->client->osdc; + struct ceph_osd_request *req; + int rc = 0; + int page_align = off & ~PAGE_MASK; + + req = ceph_osdc_new_request(osdc, layout, vino, off, &len, 0, 1, + CEPH_OSD_OP_WRITE, CEPH_OSD_FLAG_WRITE, + snapc, truncate_seq, truncate_size, + true); + if (IS_ERR(req)) + return PTR_ERR(req); + + /* it may be a short write due to an object boundary */ + osd_req_op_extent_osd_data_pages(req, 0, pages, len, page_align, + false, false); + dout("writepages %llu~%llu (%llu bytes)\n", off, len, len); + + req->r_mtime = *mtime; + rc = ceph_osdc_start_request(osdc, req, true); + if (!rc) + rc = ceph_osdc_wait_request(osdc, req); + + ceph_osdc_put_request(req); + if (rc == 0) + rc = len; + dout("writepages result %d\n", rc); + return rc; +} + /* * Write a single page, but leave the page locked. * @@ -628,7 +710,7 @@ static int writepage_nounlock(struct page *page, struct writeback_control *wbc) set_bdi_congested(inode_to_bdi(inode), BLK_RW_ASYNC); set_page_writeback(page); - err = ceph_osdc_writepages(&fsc->client->osdc, ceph_vino(inode), + err = ceph_sync_writepages(fsc, ceph_vino(inode), &ci->i_layout, snapc, page_off, len, ceph_wbc.truncate_seq, ceph_wbc.truncate_size, diff --git a/include/linux/ceph/osd_client.h b/include/linux/ceph/osd_client.h index 5a62dbd3f4c2..9d9f745b98a1 100644 --- a/include/linux/ceph/osd_client.h +++ b/include/linux/ceph/osd_client.h @@ -509,23 +509,6 @@ int ceph_osdc_call(struct ceph_osd_client *osdc, struct page *req_page, size_t req_len, struct page **resp_pages, size_t *resp_len); -extern int ceph_osdc_readpages(struct ceph_osd_client *osdc, - struct ceph_vino vino, - struct ceph_file_layout *layout, - u64 off, u64 *plen, - u32 truncate_seq, u64 truncate_size, - struct page **pages, int nr_pages, - int page_align); - -extern int ceph_osdc_writepages(struct ceph_osd_client *osdc, - struct ceph_vino vino, - struct ceph_file_layout *layout, - struct ceph_snap_context *sc, - u64 off, u64 len, - u32 truncate_seq, u64 truncate_size, - struct timespec64 *mtime, - struct page **pages, int nr_pages); - int ceph_osdc_copy_from(struct ceph_osd_client *osdc, u64 src_snapid, u64 src_version, struct ceph_object_id *src_oid, diff --git a/net/ceph/osd_client.c b/net/ceph/osd_client.c index b68b376d8c2f..8ff2856e2d52 100644 --- a/net/ceph/osd_client.c +++ b/net/ceph/osd_client.c @@ -5230,85 +5230,6 @@ void ceph_osdc_stop(struct ceph_osd_client *osdc) ceph_msgpool_destroy(&osdc->msgpool_op_reply); } -/* - * Read some contiguous pages. If we cross a stripe boundary, shorten - * *plen. Return number of bytes read, or error. - */ -int ceph_osdc_readpages(struct ceph_osd_client *osdc, - struct ceph_vino vino, struct ceph_file_layout *layout, - u64 off, u64 *plen, - u32 truncate_seq, u64 truncate_size, - struct page **pages, int num_pages, int page_align) -{ - struct ceph_osd_request *req; - int rc = 0; - - dout("readpages on ino %llx.%llx on %llu~%llu\n", vino.ino, - vino.snap, off, *plen); - req = ceph_osdc_new_request(osdc, layout, vino, off, plen, 0, 1, - CEPH_OSD_OP_READ, CEPH_OSD_FLAG_READ, - NULL, truncate_seq, truncate_size, - false); - if (IS_ERR(req)) - return PTR_ERR(req); - - /* it may be a short read due to an object boundary */ - osd_req_op_extent_osd_data_pages(req, 0, - pages, *plen, page_align, false, false); - - dout("readpages final extent is %llu~%llu (%llu bytes align %d)\n", - off, *plen, *plen, page_align); - - rc = ceph_osdc_start_request(osdc, req, false); - if (!rc) - rc = ceph_osdc_wait_request(osdc, req); - - ceph_osdc_put_request(req); - dout("readpages result %d\n", rc); - return rc; -} -EXPORT_SYMBOL(ceph_osdc_readpages); - -/* - * do a synchronous write on N pages - */ -int ceph_osdc_writepages(struct ceph_osd_client *osdc, struct ceph_vino vino, - struct ceph_file_layout *layout, - struct ceph_snap_context *snapc, - u64 off, u64 len, - u32 truncate_seq, u64 truncate_size, - struct timespec64 *mtime, - struct page **pages, int num_pages) -{ - struct ceph_osd_request *req; - int rc = 0; - int page_align = off & ~PAGE_MASK; - - req = ceph_osdc_new_request(osdc, layout, vino, off, &len, 0, 1, - CEPH_OSD_OP_WRITE, CEPH_OSD_FLAG_WRITE, - snapc, truncate_seq, truncate_size, - true); - if (IS_ERR(req)) - return PTR_ERR(req); - - /* it may be a short write due to an object boundary */ - osd_req_op_extent_osd_data_pages(req, 0, pages, len, page_align, - false, false); - dout("writepages %llu~%llu (%llu bytes)\n", off, len, len); - - req->r_mtime = *mtime; - rc = ceph_osdc_start_request(osdc, req, true); - if (!rc) - rc = ceph_osdc_wait_request(osdc, req); - - ceph_osdc_put_request(req); - if (rc == 0) - rc = len; - dout("writepages result %d\n", rc); - return rc; -} -EXPORT_SYMBOL(ceph_osdc_writepages); - static int osd_req_op_copy_from_init(struct ceph_osd_request *req, u64 src_snapid, u64 src_version, struct ceph_object_id *src_oid, From patchwork Tue Jan 28 13:02:41 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Xiubo Li X-Patchwork-Id: 11354227 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 6282E1580 for ; Tue, 28 Jan 2020 13:03:16 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 37B9B24683 for ; Tue, 28 Jan 2020 13:03:16 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="GsT4sxSu" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726111AbgA1NDO (ORCPT ); Tue, 28 Jan 2020 08:03:14 -0500 Received: from us-smtp-2.mimecast.com ([207.211.31.81]:43546 "EHLO us-smtp-delivery-1.mimecast.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1726057AbgA1NDO (ORCPT ); Tue, 28 Jan 2020 08:03:14 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1580216593; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=2eHit3CkVZuOqMCE8IObfEo3PPjXAINOXoY0VUMqcQE=; b=GsT4sxSu6ubi9cOgHVBFw4ob3BIimZIE6GgopBSjnK62y1UT1jxlp8n4HXnObQn8GhMbl+ CezUTt3UrNgzByWEYzlhqfKrh7CzywnwWREpd/f50YLWi5POhbGNUfx4eAXwHpVPPlRupi 3k6peBsV/zVG+FHdCm+BW1e15/L7XEc= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-271-3ZXVUPuGNSeX2i1LwEah6g-1; Tue, 28 Jan 2020 08:03:11 -0500 X-MC-Unique: 3ZXVUPuGNSeX2i1LwEah6g-1 Received: from smtp.corp.redhat.com (int-mx01.intmail.prod.int.phx2.redhat.com [10.5.11.11]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 4252F1800D41; Tue, 28 Jan 2020 13:03:10 +0000 (UTC) Received: from localhost.localdomain (ovpn-12-34.pek2.redhat.com [10.72.12.34]) by smtp.corp.redhat.com (Postfix) with ESMTP id 84BB788827; Tue, 28 Jan 2020 13:03:07 +0000 (UTC) From: xiubli@redhat.com To: jlayton@kernel.org, idryomov@gmail.com, zyan@redhat.com Cc: sage@redhat.com, pdonnell@redhat.com, ceph-devel@vger.kernel.org, Xiubo Li Subject: [PATCH v5 03/10] ceph: add r_end_stamp for the osdc request Date: Tue, 28 Jan 2020 08:02:41 -0500 Message-Id: <20200128130248.4266-4-xiubli@redhat.com> In-Reply-To: <20200128130248.4266-1-xiubli@redhat.com> References: <20200128130248.4266-1-xiubli@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.11 Sender: ceph-devel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org From: Xiubo Li Grab the osdc requests' end time stamp. Signed-off-by: Xiubo Li --- include/linux/ceph/osd_client.h | 1 + net/ceph/osd_client.c | 2 ++ 2 files changed, 3 insertions(+) diff --git a/include/linux/ceph/osd_client.h b/include/linux/ceph/osd_client.h index 9d9f745b98a1..00a449cfc478 100644 --- a/include/linux/ceph/osd_client.h +++ b/include/linux/ceph/osd_client.h @@ -213,6 +213,7 @@ struct ceph_osd_request { /* internal */ unsigned long r_stamp; /* jiffies, send or check time */ unsigned long r_start_stamp; /* jiffies */ + unsigned long r_end_stamp; /* jiffies */ int r_attempts; u32 r_map_dne_bound; diff --git a/net/ceph/osd_client.c b/net/ceph/osd_client.c index 8ff2856e2d52..108c9457d629 100644 --- a/net/ceph/osd_client.c +++ b/net/ceph/osd_client.c @@ -2389,6 +2389,8 @@ static void finish_request(struct ceph_osd_request *req) WARN_ON(lookup_request_mc(&osdc->map_checks, req->r_tid)); dout("%s req %p tid %llu\n", __func__, req, req->r_tid); + req->r_end_stamp = jiffies; + if (req->r_osd) unlink_request(req->r_osd, req); atomic_dec(&osdc->num_requests); From patchwork Tue Jan 28 13:02:42 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Xiubo Li X-Patchwork-Id: 11354233 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id B80B51580 for ; Tue, 28 Jan 2020 13:03:32 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 972E524683 for ; Tue, 28 Jan 2020 13:03:32 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="QYqjHiCl" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1725959AbgA1NDc (ORCPT ); Tue, 28 Jan 2020 08:03:32 -0500 Received: from us-smtp-delivery-1.mimecast.com ([205.139.110.120]:41260 "EHLO us-smtp-1.mimecast.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1725283AbgA1NDb (ORCPT ); Tue, 28 Jan 2020 08:03:31 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1580216609; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=OEJgoMb5Tx5SEdINDkcdLjht8sZVW13I7Tzt6tP0GK0=; b=QYqjHiClPCpQoKzjM4iMRbd7nrJ9N3hVNwCwHJzyWBVC580ZDYe14CMXQ3e9ThCU3KR7ql z7W2JQhsPnbXk4rNJVFOpczUFTKWt8fiHdngMy8C5gAQBsXiU3iq5OtwiI5lLoK9HfwB7+ KjpeUR7zl5AUPmIWR7g8CU+dGfg1M+M= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-364-v2CPQwmpN92B6Z4s1ZbbDA-1; Tue, 28 Jan 2020 08:03:15 -0500 X-MC-Unique: v2CPQwmpN92B6Z4s1ZbbDA-1 Received: from smtp.corp.redhat.com (int-mx01.intmail.prod.int.phx2.redhat.com [10.5.11.11]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 4ABF7107ACC4; Tue, 28 Jan 2020 13:03:14 +0000 (UTC) Received: from localhost.localdomain (ovpn-12-34.pek2.redhat.com [10.72.12.34]) by smtp.corp.redhat.com (Postfix) with ESMTP id CED5780A5C; Tue, 28 Jan 2020 13:03:10 +0000 (UTC) From: xiubli@redhat.com To: jlayton@kernel.org, idryomov@gmail.com, zyan@redhat.com Cc: sage@redhat.com, pdonnell@redhat.com, ceph-devel@vger.kernel.org, Xiubo Li Subject: [PATCH v5 04/10] ceph: add global read latency metric support Date: Tue, 28 Jan 2020 08:02:42 -0500 Message-Id: <20200128130248.4266-5-xiubli@redhat.com> In-Reply-To: <20200128130248.4266-1-xiubli@redhat.com> References: <20200128130248.4266-1-xiubli@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.11 Sender: ceph-devel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org From: Xiubo Li item total sum_lat(us) avg_lat(us) ----------------------------------------------------- read 73 3590000 49178082 URL: https://tracker.ceph.com/issues/43215 Signed-off-by: Xiubo Li --- fs/ceph/addr.c | 8 ++++++++ fs/ceph/debugfs.c | 11 +++++++++++ fs/ceph/file.c | 15 +++++++++++++++ fs/ceph/mds_client.c | 29 +++++++++++++++++++++++------ fs/ceph/mds_client.h | 9 ++------- fs/ceph/metric.h | 30 ++++++++++++++++++++++++++++++ 6 files changed, 89 insertions(+), 13 deletions(-) create mode 100644 fs/ceph/metric.h diff --git a/fs/ceph/addr.c b/fs/ceph/addr.c index 20e5ebfff389..0435a694370b 100644 --- a/fs/ceph/addr.c +++ b/fs/ceph/addr.c @@ -195,6 +195,7 @@ static int ceph_sync_readpages(struct ceph_fs_client *fsc, int page_align) { struct ceph_osd_client *osdc = &fsc->client->osdc; + struct ceph_client_metric *metric = &fsc->mdsc->metric; struct ceph_osd_request *req; int rc = 0; @@ -218,6 +219,8 @@ static int ceph_sync_readpages(struct ceph_fs_client *fsc, if (!rc) rc = ceph_osdc_wait_request(osdc, req); + ceph_update_read_latency(metric, req, rc); + ceph_osdc_put_request(req); dout("readpages result %d\n", rc); return rc; @@ -301,6 +304,8 @@ static int ceph_readpage(struct file *filp, struct page *page) static void finish_read(struct ceph_osd_request *req) { struct inode *inode = req->r_inode; + struct ceph_fs_client *fsc = ceph_inode_to_client(inode); + struct ceph_client_metric *metric = &fsc->mdsc->metric; struct ceph_osd_data *osd_data; int rc = req->r_result <= 0 ? req->r_result : 0; int bytes = req->r_result >= 0 ? req->r_result : 0; @@ -338,6 +343,9 @@ static void finish_read(struct ceph_osd_request *req) put_page(page); bytes -= PAGE_SIZE; } + + ceph_update_read_latency(metric, req, rc); + kfree(osd_data->pages); } diff --git a/fs/ceph/debugfs.c b/fs/ceph/debugfs.c index c132fdb40d53..f8a32fa335ae 100644 --- a/fs/ceph/debugfs.c +++ b/fs/ceph/debugfs.c @@ -128,8 +128,19 @@ static int metric_show(struct seq_file *s, void *p) { struct ceph_fs_client *fsc = s->private; struct ceph_mds_client *mdsc = fsc->mdsc; + s64 total, sum, avg = 0; int i; + seq_printf(s, "item total sum_lat(us) avg_lat(us)\n"); + seq_printf(s, "-----------------------------------------------------\n"); + + total = percpu_counter_sum(&mdsc->metric.total_reads); + sum = percpu_counter_sum(&mdsc->metric.read_latency_sum); + avg = total ? sum / total : 0; + seq_printf(s, "%-14s%-12lld%-16lld%lld\n", "read", + total, sum / NSEC_PER_USEC, avg / NSEC_PER_USEC); + + seq_printf(s, "\n"); seq_printf(s, "item total miss hit\n"); seq_printf(s, "-------------------------------------------------\n"); diff --git a/fs/ceph/file.c b/fs/ceph/file.c index c78dfbbb7b91..69288c39229b 100644 --- a/fs/ceph/file.c +++ b/fs/ceph/file.c @@ -588,6 +588,7 @@ static ssize_t ceph_sync_read(struct kiocb *iocb, struct iov_iter *to, struct inode *inode = file_inode(file); struct ceph_inode_info *ci = ceph_inode(inode); struct ceph_fs_client *fsc = ceph_inode_to_client(inode); + struct ceph_client_metric *metric = &fsc->mdsc->metric; struct ceph_osd_client *osdc = &fsc->client->osdc; ssize_t ret; u64 off = iocb->ki_pos; @@ -660,6 +661,9 @@ static ssize_t ceph_sync_read(struct kiocb *iocb, struct iov_iter *to, ret = ceph_osdc_start_request(osdc, req, false); if (!ret) ret = ceph_osdc_wait_request(osdc, req); + + ceph_update_read_latency(metric, req, ret); + ceph_osdc_put_request(req); i_size = i_size_read(inode); @@ -798,13 +802,20 @@ static void ceph_aio_complete_req(struct ceph_osd_request *req) struct inode *inode = req->r_inode; struct ceph_aio_request *aio_req = req->r_priv; struct ceph_osd_data *osd_data = osd_req_op_extent_osd_data(req, 0); + struct ceph_fs_client *fsc = ceph_inode_to_client(inode); + struct ceph_client_metric *metric = &fsc->mdsc->metric; BUG_ON(osd_data->type != CEPH_OSD_DATA_TYPE_BVECS); BUG_ON(!osd_data->num_bvecs); + BUG_ON(!aio_req); dout("ceph_aio_complete_req %p rc %d bytes %u\n", inode, rc, osd_data->bvec_pos.iter.bi_size); + /* r_start_stamp == 0 means the request was not submitted */ + if (req->r_start_stamp && !aio_req->write) + ceph_update_read_latency(metric, req, rc); + if (rc == -EOLDSNAPC) { struct ceph_aio_work *aio_work; BUG_ON(!aio_req->write); @@ -933,6 +944,7 @@ ceph_direct_read_write(struct kiocb *iocb, struct iov_iter *iter, struct inode *inode = file_inode(file); struct ceph_inode_info *ci = ceph_inode(inode); struct ceph_fs_client *fsc = ceph_inode_to_client(inode); + struct ceph_client_metric *metric = &fsc->mdsc->metric; struct ceph_vino vino; struct ceph_osd_request *req; struct bio_vec *bvecs; @@ -1049,6 +1061,9 @@ ceph_direct_read_write(struct kiocb *iocb, struct iov_iter *iter, if (!ret) ret = ceph_osdc_wait_request(&fsc->client->osdc, req); + if (!write) + ceph_update_read_latency(metric, req, ret); + size = i_size_read(inode); if (!write) { if (ret == -ENOENT) diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c index 141c1c03636c..101b51f9f05d 100644 --- a/fs/ceph/mds_client.c +++ b/fs/ceph/mds_client.c @@ -4182,14 +4182,29 @@ static int ceph_mdsc_metric_init(struct ceph_client_metric *metric) atomic64_set(&metric->total_dentries, 0); ret = percpu_counter_init(&metric->d_lease_hit, 0, GFP_KERNEL); if (ret) - return ret; + return ret;; ret = percpu_counter_init(&metric->d_lease_mis, 0, GFP_KERNEL); - if (ret) { - percpu_counter_destroy(&metric->d_lease_hit); - return ret; - } + if (ret) + goto err_dlease_mis; - return 0; + ret = percpu_counter_init(&metric->total_reads, 0, GFP_KERNEL); + if (ret) + goto err_total_reads; + + ret = percpu_counter_init(&metric->read_latency_sum, 0, GFP_KERNEL); + if (ret) + goto err_read_latency_sum; + + return ret; + +err_read_latency_sum: + percpu_counter_destroy(&metric->total_reads); +err_total_reads: + percpu_counter_destroy(&metric->d_lease_mis); +err_dlease_mis: + percpu_counter_destroy(&metric->d_lease_hit); + + return ret; } int ceph_mdsc_init(struct ceph_fs_client *fsc) @@ -4529,6 +4544,8 @@ void ceph_mdsc_destroy(struct ceph_fs_client *fsc) ceph_mdsc_stop(mdsc); + percpu_counter_destroy(&mdsc->metric.read_latency_sum); + percpu_counter_destroy(&mdsc->metric.total_reads); percpu_counter_destroy(&mdsc->metric.d_lease_mis); percpu_counter_destroy(&mdsc->metric.d_lease_hit); diff --git a/fs/ceph/mds_client.h b/fs/ceph/mds_client.h index ba74ff74c59c..574d4e5a5de2 100644 --- a/fs/ceph/mds_client.h +++ b/fs/ceph/mds_client.h @@ -16,6 +16,8 @@ #include #include +#include "metric.h" + /* The first 8 bits are reserved for old ceph releases */ enum ceph_feature_type { CEPHFS_FEATURE_MIMIC = 8, @@ -361,13 +363,6 @@ struct cap_wait { int want; }; -/* This is the global metrics */ -struct ceph_client_metric { - atomic64_t total_dentries; - struct percpu_counter d_lease_hit; - struct percpu_counter d_lease_mis; -}; - /* * mds client state */ diff --git a/fs/ceph/metric.h b/fs/ceph/metric.h new file mode 100644 index 000000000000..2a7b8f3fe6a4 --- /dev/null +++ b/fs/ceph/metric.h @@ -0,0 +1,30 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _FS_CEPH_MDS_METRIC_H +#define _FS_CEPH_MDS_METRIC_H + +#include + +/* This is the global metrics */ +struct ceph_client_metric { + atomic64_t total_dentries; + struct percpu_counter d_lease_hit; + struct percpu_counter d_lease_mis; + + struct percpu_counter total_reads; + struct percpu_counter read_latency_sum; +}; + +static inline void ceph_update_read_latency(struct ceph_client_metric *m, + struct ceph_osd_request *req, + int rc) +{ + if (!m || !req) + return; + + if (rc >= 0 || rc == -ENOENT || rc == -ETIMEDOUT) { + s64 latency = req->r_end_stamp - req->r_start_stamp; + percpu_counter_inc(&m->total_reads); + percpu_counter_add(&m->read_latency_sum, latency); + } +} +#endif From patchwork Tue Jan 28 13:02:43 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Xiubo Li X-Patchwork-Id: 11354237 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 39472138C for ; Tue, 28 Jan 2020 13:03:41 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 0E23E24683 for ; Tue, 28 Jan 2020 13:03:41 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="IESvyoNS" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726063AbgA1NDk (ORCPT ); Tue, 28 Jan 2020 08:03:40 -0500 Received: from us-smtp-delivery-1.mimecast.com ([205.139.110.120]:48998 "EHLO us-smtp-1.mimecast.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1726007AbgA1NDk (ORCPT ); Tue, 28 Jan 2020 08:03:40 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1580216618; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=19M73t7HmnYSsqP0+igGXGkeLAWlntz+rlHlLYCfMdI=; b=IESvyoNSuRbm5G4VCMbWeGePdcmlusy2n/NXG82mVeuK24ZBZZexXDUnP2AouE9HeCOwZV UF7Zlp9GG9zwWz1Ez0xH+qDSc5Hx3heIJBHSDY6UfkR3w0WJxmwTw5D/HyG0lQjAdMjbGV YgOPy0L5Mh1hKrKMlRKx/YRRNEIbWnE= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-298-0TIP8JVgOiucngXCQjQF-g-1; Tue, 28 Jan 2020 08:03:23 -0500 X-MC-Unique: 0TIP8JVgOiucngXCQjQF-g-1 Received: from smtp.corp.redhat.com (int-mx01.intmail.prod.int.phx2.redhat.com [10.5.11.11]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 35CFC8010C4; Tue, 28 Jan 2020 13:03:22 +0000 (UTC) Received: from localhost.localdomain (ovpn-12-34.pek2.redhat.com [10.72.12.34]) by smtp.corp.redhat.com (Postfix) with ESMTP id D3F94863D6; Tue, 28 Jan 2020 13:03:14 +0000 (UTC) From: xiubli@redhat.com To: jlayton@kernel.org, idryomov@gmail.com, zyan@redhat.com Cc: sage@redhat.com, pdonnell@redhat.com, ceph-devel@vger.kernel.org, Xiubo Li Subject: [PATCH v5 05/10] ceph: add global write latency metric support Date: Tue, 28 Jan 2020 08:02:43 -0500 Message-Id: <20200128130248.4266-6-xiubli@redhat.com> In-Reply-To: <20200128130248.4266-1-xiubli@redhat.com> References: <20200128130248.4266-1-xiubli@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.11 Sender: ceph-devel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org From: Xiubo Li item total sum_lat(us) avg_lat(us) ----------------------------------------------------- write 222 5287750000 23818693 URL: https://tracker.ceph.com/issues/43215 Signed-off-by: Xiubo Li --- fs/ceph/addr.c | 10 ++++++++++ fs/ceph/debugfs.c | 6 ++++++ fs/ceph/file.c | 14 +++++++++++--- fs/ceph/mds_client.c | 15 ++++++++++++++- fs/ceph/metric.h | 17 +++++++++++++++++ 5 files changed, 58 insertions(+), 4 deletions(-) diff --git a/fs/ceph/addr.c b/fs/ceph/addr.c index 0435a694370b..74868231f007 100644 --- a/fs/ceph/addr.c +++ b/fs/ceph/addr.c @@ -632,6 +632,7 @@ static int ceph_sync_writepages(struct ceph_fs_client *fsc, struct page **pages, int num_pages) { struct ceph_osd_client *osdc = &fsc->client->osdc; + struct ceph_client_metric *metric = &fsc->mdsc->metric; struct ceph_osd_request *req; int rc = 0; int page_align = off & ~PAGE_MASK; @@ -653,6 +654,8 @@ static int ceph_sync_writepages(struct ceph_fs_client *fsc, if (!rc) rc = ceph_osdc_wait_request(osdc, req); + ceph_update_write_latency(metric, req, rc); + ceph_osdc_put_request(req); if (rc == 0) rc = len; @@ -792,6 +795,7 @@ static void writepages_finish(struct ceph_osd_request *req) struct ceph_snap_context *snapc = req->r_snapc; struct address_space *mapping = inode->i_mapping; struct ceph_fs_client *fsc = ceph_inode_to_client(inode); + struct ceph_client_metric *metric = &fsc->mdsc->metric; bool remove_page; dout("writepages_finish %p rc %d\n", inode, rc); @@ -804,6 +808,8 @@ static void writepages_finish(struct ceph_osd_request *req) ceph_clear_error_write(ci); } + ceph_update_write_latency(metric, req, rc); + /* * We lost the cache cap, need to truncate the page before * it is unlocked, otherwise we'd truncate it later in the @@ -1752,6 +1758,7 @@ int ceph_uninline_data(struct file *filp, struct page *locked_page) struct inode *inode = file_inode(filp); struct ceph_inode_info *ci = ceph_inode(inode); struct ceph_fs_client *fsc = ceph_inode_to_client(inode); + struct ceph_client_metric *metric = &fsc->mdsc->metric; struct ceph_osd_request *req; struct page *page = NULL; u64 len, inline_version; @@ -1864,6 +1871,9 @@ int ceph_uninline_data(struct file *filp, struct page *locked_page) err = ceph_osdc_start_request(&fsc->client->osdc, req, false); if (!err) err = ceph_osdc_wait_request(&fsc->client->osdc, req); + + ceph_update_write_latency(metric, req, err); + out_put: ceph_osdc_put_request(req); if (err == -ECANCELED) diff --git a/fs/ceph/debugfs.c b/fs/ceph/debugfs.c index f8a32fa335ae..3d27f2e6f556 100644 --- a/fs/ceph/debugfs.c +++ b/fs/ceph/debugfs.c @@ -140,6 +140,12 @@ static int metric_show(struct seq_file *s, void *p) seq_printf(s, "%-14s%-12lld%-16lld%lld\n", "read", total, sum / NSEC_PER_USEC, avg / NSEC_PER_USEC); + total = percpu_counter_sum(&mdsc->metric.total_writes); + sum = percpu_counter_sum(&mdsc->metric.write_latency_sum); + avg = total ? sum / total : 0; + seq_printf(s, "%-14s%-12lld%-16lld%lld\n", "write", + total, sum / NSEC_PER_USEC, avg / NSEC_PER_USEC); + seq_printf(s, "\n"); seq_printf(s, "item total miss hit\n"); seq_printf(s, "-------------------------------------------------\n"); diff --git a/fs/ceph/file.c b/fs/ceph/file.c index 69288c39229b..9940eb85eff6 100644 --- a/fs/ceph/file.c +++ b/fs/ceph/file.c @@ -813,8 +813,12 @@ static void ceph_aio_complete_req(struct ceph_osd_request *req) inode, rc, osd_data->bvec_pos.iter.bi_size); /* r_start_stamp == 0 means the request was not submitted */ - if (req->r_start_stamp && !aio_req->write) - ceph_update_read_latency(metric, req, rc); + if (req->r_start_stamp) { + if (aio_req->write) + ceph_update_write_latency(metric, req, rc); + else + ceph_update_read_latency(metric, req, rc); + } if (rc == -EOLDSNAPC) { struct ceph_aio_work *aio_work; @@ -1061,7 +1065,9 @@ ceph_direct_read_write(struct kiocb *iocb, struct iov_iter *iter, if (!ret) ret = ceph_osdc_wait_request(&fsc->client->osdc, req); - if (!write) + if (write) + ceph_update_write_latency(metric, req, ret); + else ceph_update_read_latency(metric, req, ret); size = i_size_read(inode); @@ -1150,6 +1156,7 @@ ceph_sync_write(struct kiocb *iocb, struct iov_iter *from, loff_t pos, struct inode *inode = file_inode(file); struct ceph_inode_info *ci = ceph_inode(inode); struct ceph_fs_client *fsc = ceph_inode_to_client(inode); + struct ceph_client_metric *metric = &fsc->mdsc->metric; struct ceph_vino vino; struct ceph_osd_request *req; struct page **pages; @@ -1235,6 +1242,7 @@ ceph_sync_write(struct kiocb *iocb, struct iov_iter *from, loff_t pos, if (!ret) ret = ceph_osdc_wait_request(&fsc->client->osdc, req); + ceph_update_write_latency(metric, req, ret); out: ceph_osdc_put_request(req); if (ret != 0) { diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c index 101b51f9f05d..d072cab77ab2 100644 --- a/fs/ceph/mds_client.c +++ b/fs/ceph/mds_client.c @@ -4195,8 +4195,19 @@ static int ceph_mdsc_metric_init(struct ceph_client_metric *metric) if (ret) goto err_read_latency_sum; - return ret; + ret = percpu_counter_init(&metric->total_writes, 0, GFP_KERNEL); + if (ret) + goto err_total_writes; + ret = percpu_counter_init(&metric->write_latency_sum, 0, GFP_KERNEL); + if (ret) + goto err_write_latency_sum; + + return ret; +err_write_latency_sum: + percpu_counter_destroy(&metric->total_writes); +err_total_writes: + percpu_counter_destroy(&metric->read_latency_sum); err_read_latency_sum: percpu_counter_destroy(&metric->total_reads); err_total_reads: @@ -4544,6 +4555,8 @@ void ceph_mdsc_destroy(struct ceph_fs_client *fsc) ceph_mdsc_stop(mdsc); + percpu_counter_destroy(&mdsc->metric.write_latency_sum); + percpu_counter_destroy(&mdsc->metric.total_writes); percpu_counter_destroy(&mdsc->metric.read_latency_sum); percpu_counter_destroy(&mdsc->metric.total_reads); percpu_counter_destroy(&mdsc->metric.d_lease_mis); diff --git a/fs/ceph/metric.h b/fs/ceph/metric.h index 2a7b8f3fe6a4..49546961eeed 100644 --- a/fs/ceph/metric.h +++ b/fs/ceph/metric.h @@ -12,6 +12,9 @@ struct ceph_client_metric { struct percpu_counter total_reads; struct percpu_counter read_latency_sum; + + struct percpu_counter total_writes; + struct percpu_counter write_latency_sum; }; static inline void ceph_update_read_latency(struct ceph_client_metric *m, @@ -27,4 +30,18 @@ static inline void ceph_update_read_latency(struct ceph_client_metric *m, percpu_counter_add(&m->read_latency_sum, latency); } } + +static inline void ceph_update_write_latency(struct ceph_client_metric *m, + struct ceph_osd_request *req, + int rc) +{ + if (!m || !req) + return; + + if (!rc || rc == -ETIMEDOUT) { + s64 latency = req->r_end_stamp - req->r_start_stamp; + percpu_counter_inc(&m->total_writes); + percpu_counter_add(&m->write_latency_sum, latency); + } +} #endif From patchwork Tue Jan 28 13:02:44 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Xiubo Li X-Patchwork-Id: 11354231 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 772F4138C for ; Tue, 28 Jan 2020 13:03:32 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 5601D24683 for ; Tue, 28 Jan 2020 13:03:32 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="Sf3Wuqxg" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1725937AbgA1NDb (ORCPT ); Tue, 28 Jan 2020 08:03:31 -0500 Received: from us-smtp-delivery-1.mimecast.com ([207.211.31.120]:40465 "EHLO us-smtp-1.mimecast.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1725852AbgA1NDb (ORCPT ); Tue, 28 Jan 2020 08:03:31 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1580216610; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=pZyMP45G5kXP8v419o76iOuhS2F62vcSL96XsFjFBMQ=; b=Sf3WuqxgLFyKW9rFCuUScSYCPqzaOEqG8VFmflL8BZ+qtZRhGE0hCoV5vNlkMTRZKIiLGB Dv/LOGb6fYEH+LrBQr52smcGHJSWeWSTaEM0Pty3UU/Svf0xGoW8iqDxpeEZrS3onVu/A9 3b86lvf7djwzuGpBkUdVIniliQlzF30= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-379-gmAWtRz8MLOFHw-9l2y7IQ-1; Tue, 28 Jan 2020 08:03:27 -0500 X-MC-Unique: gmAWtRz8MLOFHw-9l2y7IQ-1 Received: from smtp.corp.redhat.com (int-mx01.intmail.prod.int.phx2.redhat.com [10.5.11.11]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 6A199107ACCA; Tue, 28 Jan 2020 13:03:26 +0000 (UTC) Received: from localhost.localdomain (ovpn-12-34.pek2.redhat.com [10.72.12.34]) by smtp.corp.redhat.com (Postfix) with ESMTP id C21FD863D3; Tue, 28 Jan 2020 13:03:22 +0000 (UTC) From: xiubli@redhat.com To: jlayton@kernel.org, idryomov@gmail.com, zyan@redhat.com Cc: sage@redhat.com, pdonnell@redhat.com, ceph-devel@vger.kernel.org, Xiubo Li Subject: [PATCH v5 06/10] ceph: add global metadata perf metric support Date: Tue, 28 Jan 2020 08:02:44 -0500 Message-Id: <20200128130248.4266-7-xiubli@redhat.com> In-Reply-To: <20200128130248.4266-1-xiubli@redhat.com> References: <20200128130248.4266-1-xiubli@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.11 Sender: ceph-devel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org From: Xiubo Li item total sum_lat(us) avg_lat(us) ----------------------------------------------------- metadata 1288 24506000 19026 URL: https://tracker.ceph.com/issues/43215 Signed-off-by: Xiubo Li --- fs/ceph/debugfs.c | 6 ++++++ fs/ceph/mds_client.c | 25 +++++++++++++++++++++++++ fs/ceph/metric.h | 13 +++++++++++++ 3 files changed, 44 insertions(+) diff --git a/fs/ceph/debugfs.c b/fs/ceph/debugfs.c index 3d27f2e6f556..7fd031c18309 100644 --- a/fs/ceph/debugfs.c +++ b/fs/ceph/debugfs.c @@ -146,6 +146,12 @@ static int metric_show(struct seq_file *s, void *p) seq_printf(s, "%-14s%-12lld%-16lld%lld\n", "write", total, sum / NSEC_PER_USEC, avg / NSEC_PER_USEC); + total = percpu_counter_sum(&mdsc->metric.total_metadatas); + sum = percpu_counter_sum(&mdsc->metric.metadata_latency_sum); + avg = total ? sum / total : 0; + seq_printf(s, "%-14s%-12lld%-16lld%lld\n", "metadata", + total, sum / NSEC_PER_USEC, avg / NSEC_PER_USEC); + seq_printf(s, "\n"); seq_printf(s, "item total miss hit\n"); seq_printf(s, "-------------------------------------------------\n"); diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c index d072cab77ab2..92a933810a79 100644 --- a/fs/ceph/mds_client.c +++ b/fs/ceph/mds_client.c @@ -2772,6 +2772,12 @@ static int ceph_mdsc_wait_request(struct ceph_mds_client *mdsc, else err = timeleft; /* killed */ } + + if (!err || err == -EIO) { + s64 latency = jiffies - req->r_started; + ceph_update_metadata_latency(&mdsc->metric, latency); + } + dout("do_request waited, got %d\n", err); mutex_lock(&mdsc->mutex); @@ -3033,6 +3039,11 @@ static void handle_reply(struct ceph_mds_session *session, struct ceph_msg *msg) /* kick calling process */ complete_request(mdsc, req); + + if (!result || result == -ENOENT) { + s64 latency = jiffies - req->r_started; + ceph_update_metadata_latency(&mdsc->metric, latency); + } out: ceph_mdsc_put_request(req); return; @@ -4203,7 +4214,19 @@ static int ceph_mdsc_metric_init(struct ceph_client_metric *metric) if (ret) goto err_write_latency_sum; + ret = percpu_counter_init(&metric->total_metadatas, 0, GFP_KERNEL); + if (ret) + goto err_total_metadatas; + + ret = percpu_counter_init(&metric->metadata_latency_sum, 0, GFP_KERNEL); + if (ret) + goto err_metadata_latency_sum; + return ret; +err_metadata_latency_sum: + percpu_counter_destroy(&metric->total_metadatas); +err_total_metadatas: + percpu_counter_destroy(&metric->write_latency_sum); err_write_latency_sum: percpu_counter_destroy(&metric->total_writes); err_total_writes: @@ -4555,6 +4578,8 @@ void ceph_mdsc_destroy(struct ceph_fs_client *fsc) ceph_mdsc_stop(mdsc); + percpu_counter_destroy(&mdsc->metric.metadata_latency_sum); + percpu_counter_destroy(&mdsc->metric.total_metadatas); percpu_counter_destroy(&mdsc->metric.write_latency_sum); percpu_counter_destroy(&mdsc->metric.total_writes); percpu_counter_destroy(&mdsc->metric.read_latency_sum); diff --git a/fs/ceph/metric.h b/fs/ceph/metric.h index 49546961eeed..3cda616ba594 100644 --- a/fs/ceph/metric.h +++ b/fs/ceph/metric.h @@ -15,6 +15,9 @@ struct ceph_client_metric { struct percpu_counter total_writes; struct percpu_counter write_latency_sum; + + struct percpu_counter total_metadatas; + struct percpu_counter metadata_latency_sum; }; static inline void ceph_update_read_latency(struct ceph_client_metric *m, @@ -44,4 +47,14 @@ static inline void ceph_update_write_latency(struct ceph_client_metric *m, percpu_counter_add(&m->write_latency_sum, latency); } } + +static inline void ceph_update_metadata_latency(struct ceph_client_metric *m, + s64 latency) +{ + if (!m) + return; + + percpu_counter_inc(&m->total_metadatas); + percpu_counter_add(&m->metadata_latency_sum, latency); +} #endif From patchwork Tue Jan 28 13:02:45 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Xiubo Li X-Patchwork-Id: 11354235 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 1C479112B for ; Tue, 28 Jan 2020 13:03:38 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id DBEEF24683 for ; Tue, 28 Jan 2020 13:03:37 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="GeH9UfcZ" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726024AbgA1NDh (ORCPT ); Tue, 28 Jan 2020 08:03:37 -0500 Received: from us-smtp-delivery-1.mimecast.com ([207.211.31.120]:56756 "EHLO us-smtp-1.mimecast.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1726007AbgA1NDg (ORCPT ); Tue, 28 Jan 2020 08:03:36 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1580216615; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=d+NR+Fs0tQQ9BCTWLeWCgTCXZbkoa6B/pvtc7Oo9goo=; b=GeH9UfcZ+A/E8XaHeDoqWOlqVd/q8cW2kjRISpfSAajqiDLju8JF2y9OxYUHR36arOHAFp RbrIGhpqqWcSlA21QX2UIgjNYYWV9I2xtEGvzSuN2ib1Z2i+k4iDbYRkMU1oERP/aCk7Ki 4kBIEUwNBDiTfZ77+aGYsUFYyU0C5C8= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-211-2CkA1oimM7K9AvoOpR-9QA-1; Tue, 28 Jan 2020 08:03:33 -0500 X-MC-Unique: 2CkA1oimM7K9AvoOpR-9QA-1 Received: from smtp.corp.redhat.com (int-mx01.intmail.prod.int.phx2.redhat.com [10.5.11.11]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id E135810054E3; Tue, 28 Jan 2020 13:03:31 +0000 (UTC) Received: from localhost.localdomain (ovpn-12-34.pek2.redhat.com [10.72.12.34]) by smtp.corp.redhat.com (Postfix) with ESMTP id 01BCF80A5C; Tue, 28 Jan 2020 13:03:26 +0000 (UTC) From: xiubli@redhat.com To: jlayton@kernel.org, idryomov@gmail.com, zyan@redhat.com Cc: sage@redhat.com, pdonnell@redhat.com, ceph-devel@vger.kernel.org, Xiubo Li Subject: [PATCH v5 07/10] ceph: periodically send perf metrics to MDS Date: Tue, 28 Jan 2020 08:02:45 -0500 Message-Id: <20200128130248.4266-8-xiubli@redhat.com> In-Reply-To: <20200128130248.4266-1-xiubli@redhat.com> References: <20200128130248.4266-1-xiubli@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.11 Sender: ceph-devel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org From: Xiubo Li Add enable/disable sending metrics to MDS debugfs and disabled as default, if it's enabled the kclient will send metrics every second. This will send global dentry lease hit/miss and read/write/metadata latency metrics and each session's caps hit/miss metric to MDS. Every time only sends the global metrics once via any availible session. URL: https://tracker.ceph.com/issues/43215 Signed-off-by: Xiubo Li --- fs/ceph/debugfs.c | 44 +++++++- fs/ceph/mds_client.c | 201 ++++++++++++++++++++++++++++++++--- fs/ceph/mds_client.h | 3 + fs/ceph/metric.h | 76 +++++++++++++ fs/ceph/super.h | 1 + include/linux/ceph/ceph_fs.h | 1 + 6 files changed, 307 insertions(+), 19 deletions(-) diff --git a/fs/ceph/debugfs.c b/fs/ceph/debugfs.c index 7fd031c18309..8aae7ecea54a 100644 --- a/fs/ceph/debugfs.c +++ b/fs/ceph/debugfs.c @@ -124,6 +124,40 @@ static int mdsc_show(struct seq_file *s, void *p) return 0; } +/* + * metrics debugfs + */ +static int sending_metrics_set(void *data, u64 val) +{ + struct ceph_fs_client *fsc = (struct ceph_fs_client *)data; + struct ceph_mds_client *mdsc = fsc->mdsc; + + if (val > 1) { + pr_err("Invalid sending metrics set value %llu\n", val); + return -EINVAL; + } + + mutex_lock(&mdsc->mutex); + mdsc->sending_metrics = (unsigned int)val; + mutex_unlock(&mdsc->mutex); + + return 0; +} + +static int sending_metrics_get(void *data, u64 *val) +{ + struct ceph_fs_client *fsc = (struct ceph_fs_client *)data; + struct ceph_mds_client *mdsc = fsc->mdsc; + + mutex_lock(&mdsc->mutex); + *val = (u64)mdsc->sending_metrics; + mutex_unlock(&mdsc->mutex); + + return 0; +} +DEFINE_SIMPLE_ATTRIBUTE(sending_metrics_fops, sending_metrics_get, + sending_metrics_set, "%llu\n"); + static int metric_show(struct seq_file *s, void *p) { struct ceph_fs_client *fsc = s->private; @@ -302,11 +336,9 @@ static int congestion_kb_get(void *data, u64 *val) *val = (u64)fsc->mount_options->congestion_kb; return 0; } - DEFINE_SIMPLE_ATTRIBUTE(congestion_kb_fops, congestion_kb_get, congestion_kb_set, "%llu\n"); - void ceph_fs_debugfs_cleanup(struct ceph_fs_client *fsc) { dout("ceph_fs_debugfs_cleanup\n"); @@ -316,6 +348,7 @@ void ceph_fs_debugfs_cleanup(struct ceph_fs_client *fsc) debugfs_remove(fsc->debugfs_mds_sessions); debugfs_remove(fsc->debugfs_caps); debugfs_remove(fsc->debugfs_metric); + debugfs_remove(fsc->debugfs_sending_metrics); debugfs_remove(fsc->debugfs_mdsc); } @@ -356,6 +389,13 @@ void ceph_fs_debugfs_init(struct ceph_fs_client *fsc) fsc, &mdsc_show_fops); + fsc->debugfs_sending_metrics = + debugfs_create_file("sending_metrics", + 0600, + fsc->client->debugfs_dir, + fsc, + &sending_metrics_fops); + fsc->debugfs_metric = debugfs_create_file("metrics", 0400, fsc->client->debugfs_dir, diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c index 92a933810a79..d765804dc855 100644 --- a/fs/ceph/mds_client.c +++ b/fs/ceph/mds_client.c @@ -4104,13 +4104,156 @@ static void maybe_recover_session(struct ceph_mds_client *mdsc) ceph_force_reconnect(fsc->sb); } +/* + * called under s_mutex + */ +static bool ceph_mdsc_send_metrics(struct ceph_mds_client *mdsc, + struct ceph_mds_session *s, + bool skip_global) +{ + struct ceph_metric_head *head; + struct ceph_metric_cap *cap; + struct ceph_metric_dentry_lease *lease; + struct ceph_metric_read_latency *read; + struct ceph_metric_write_latency *write; + struct ceph_metric_metadata_latency *meta; + struct ceph_msg *msg; + struct timespec64 ts; + s32 len = sizeof(*head) + sizeof(*cap); + s64 sum, total, avg; + s32 items = 0; + + if (!mdsc || !s) + return false; + + if (!skip_global) { + len += sizeof(*lease); + len += sizeof(*read); + len += sizeof(*write); + len += sizeof(*meta); + } + + msg = ceph_msg_new(CEPH_MSG_CLIENT_METRICS, len, GFP_NOFS, true); + if (!msg) { + pr_err("send metrics to mds%d, failed to allocate message\n", + s->s_mds); + return false; + } + + head = msg->front.iov_base; + + /* encode the cap metric */ + cap = (struct ceph_metric_cap *)(head + 1); + cap->type = cpu_to_le32(CLIENT_METRIC_TYPE_CAP_INFO); + cap->ver = 1; + cap->campat = 1; + cap->data_len = cpu_to_le32(sizeof(*cap) - 10); + cap->hit = cpu_to_le64(percpu_counter_sum(&s->i_caps_hit)); + cap->mis = cpu_to_le64(percpu_counter_sum(&s->i_caps_mis)); + cap->total = cpu_to_le64(s->s_nr_caps); + items++; + + dout("cap metric hit %lld, mis %lld, total caps %lld", + le64_to_cpu(cap->hit), le64_to_cpu(cap->mis), + le64_to_cpu(cap->total)); + + /* only send the global once */ + if (skip_global) + goto skip_global; + + /* encode the dentry lease metric */ + lease = (struct ceph_metric_dentry_lease *)(cap + 1); + lease->type = cpu_to_le32(CLIENT_METRIC_TYPE_DENTRY_LEASE); + lease->ver = 1; + lease->campat = 1; + lease->data_len = cpu_to_le32(sizeof(*lease) - 10); + lease->hit = cpu_to_le64(percpu_counter_sum(&mdsc->metric.d_lease_hit)); + lease->mis = cpu_to_le64(percpu_counter_sum(&mdsc->metric.d_lease_mis)); + lease->total = cpu_to_le64(atomic64_read(&mdsc->metric.total_dentries)); + items++; + + dout("dentry lease metric hit %lld, mis %lld, total dentries %lld", + le64_to_cpu(lease->hit), le64_to_cpu(lease->mis), + le64_to_cpu(lease->total)); + + /* encode the read latency metric */ + read = (struct ceph_metric_read_latency *)(lease + 1); + read->type = cpu_to_le32(CLIENT_METRIC_TYPE_READ_LATENCY); + read->ver = 1; + read->campat = 1; + read->data_len = cpu_to_le32(sizeof(*read) - 10); + total = percpu_counter_sum(&mdsc->metric.total_reads), + sum = percpu_counter_sum(&mdsc->metric.read_latency_sum); + avg = total ? sum / total : 0; + ts = ns_to_timespec64(avg); + read->sec = cpu_to_le32(ts.tv_sec); + read->nsec = cpu_to_le32(ts.tv_nsec); + items++; + + dout("read latency metric total %lld, sum lat %lld, avg lat %lld", + total, sum, avg); + + /* encode the write latency metric */ + write = (struct ceph_metric_write_latency *)(read + 1); + write->type = cpu_to_le32(CLIENT_METRIC_TYPE_WRITE_LATENCY); + write->ver = 1; + write->campat = 1; + write->data_len = cpu_to_le32(sizeof(*write) - 10); + total = percpu_counter_sum(&mdsc->metric.total_writes), + sum = percpu_counter_sum(&mdsc->metric.write_latency_sum); + avg = total ? sum / total : 0; + ts = ns_to_timespec64(avg); + write->sec = cpu_to_le32(ts.tv_sec); + write->nsec = cpu_to_le32(ts.tv_nsec); + items++; + + dout("write latency metric total %lld, sum lat %lld, avg lat %lld", + total, sum, avg); + + /* encode the metadata latency metric */ + meta = (struct ceph_metric_metadata_latency *)(write + 1); + meta->type = cpu_to_le32(CLIENT_METRIC_TYPE_METADATA_LATENCY); + meta->ver = 1; + meta->campat = 1; + meta->data_len = cpu_to_le32(sizeof(*meta) - 10); + total = percpu_counter_sum(&mdsc->metric.total_metadatas), + sum = percpu_counter_sum(&mdsc->metric.metadata_latency_sum); + avg = total ? sum / total : 0; + ts = ns_to_timespec64(avg); + meta->sec = cpu_to_le32(ts.tv_sec); + meta->nsec = cpu_to_le32(ts.tv_nsec); + items++; + + dout("metadata latency metric total %lld, sum lat %lld, avg lat %lld", + total, sum, avg); + +skip_global: + put_unaligned_le32(items, &head->num); + msg->front.iov_len = cpu_to_le32(len); + msg->hdr.version = cpu_to_le16(1); + msg->hdr.compat_version = cpu_to_le16(1); + msg->hdr.front_len = cpu_to_le32(msg->front.iov_len); + dout("send metrics to mds%d %p\n", s->s_mds, msg); + ceph_con_send(&s->s_con, msg); + + return true; +} + /* * delayed work -- periodically trim expired leases, renew caps with mds */ +#define CEPH_WORK_DELAY_DEF 5 static void schedule_delayed(struct ceph_mds_client *mdsc) { - int delay = 5; - unsigned hz = round_jiffies_relative(HZ * delay); + unsigned int hz; + int delay = CEPH_WORK_DELAY_DEF; + + mutex_lock(&mdsc->mutex); + if (mdsc->sending_metrics) + delay = 1; + mutex_unlock(&mdsc->mutex); + + hz = round_jiffies_relative(HZ * delay); schedule_delayed_work(&mdsc->delayed_work, hz); } @@ -4121,18 +4264,28 @@ static void delayed_work(struct work_struct *work) container_of(work, struct ceph_mds_client, delayed_work.work); int renew_interval; int renew_caps; + bool metric_only; + bool sending_metrics; + bool g_skip = false; dout("mdsc delayed_work\n"); mutex_lock(&mdsc->mutex); - renew_interval = mdsc->mdsmap->m_session_timeout >> 2; - renew_caps = time_after_eq(jiffies, HZ*renew_interval + - mdsc->last_renew_caps); - if (renew_caps) - mdsc->last_renew_caps = jiffies; + sending_metrics = !!mdsc->sending_metrics; + metric_only = mdsc->sending_metrics && + (mdsc->ticks++ % CEPH_WORK_DELAY_DEF); + + if (!metric_only) { + renew_interval = mdsc->mdsmap->m_session_timeout >> 2; + renew_caps = time_after_eq(jiffies, HZ*renew_interval + + mdsc->last_renew_caps); + if (renew_caps) + mdsc->last_renew_caps = jiffies; + } for (i = 0; i < mdsc->max_sessions; i++) { struct ceph_mds_session *s = __ceph_lookup_mds_session(mdsc, i); + if (!s) continue; if (s->s_state == CEPH_MDS_SESSION_CLOSING) { @@ -4158,13 +4311,20 @@ static void delayed_work(struct work_struct *work) mutex_unlock(&mdsc->mutex); mutex_lock(&s->s_mutex); - if (renew_caps) - send_renew_caps(mdsc, s); - else - ceph_con_keepalive(&s->s_con); - if (s->s_state == CEPH_MDS_SESSION_OPEN || - s->s_state == CEPH_MDS_SESSION_HUNG) - ceph_send_cap_releases(mdsc, s); + + if (sending_metrics) + g_skip = ceph_mdsc_send_metrics(mdsc, s, g_skip); + + if (!metric_only) { + if (renew_caps) + send_renew_caps(mdsc, s); + else + ceph_con_keepalive(&s->s_con); + if (s->s_state == CEPH_MDS_SESSION_OPEN || + s->s_state == CEPH_MDS_SESSION_HUNG) + ceph_send_cap_releases(mdsc, s); + } + mutex_unlock(&s->s_mutex); ceph_put_mds_session(s); @@ -4172,6 +4332,9 @@ static void delayed_work(struct work_struct *work) } mutex_unlock(&mdsc->mutex); + if (metric_only) + goto delay_work; + ceph_check_delayed_caps(mdsc); ceph_queue_cap_reclaim_work(mdsc); @@ -4180,11 +4343,13 @@ static void delayed_work(struct work_struct *work) maybe_recover_session(mdsc); +delay_work: schedule_delayed(mdsc); } -static int ceph_mdsc_metric_init(struct ceph_client_metric *metric) +static int ceph_mdsc_metric_init(struct ceph_mds_client *mdsc) { + struct ceph_client_metric *metric = &mdsc->metric; int ret; if (!metric) @@ -4222,7 +4387,9 @@ static int ceph_mdsc_metric_init(struct ceph_client_metric *metric) if (ret) goto err_metadata_latency_sum; - return ret; + mdsc->sending_metrics = 0; + mdsc->ticks = 0; + return 0; err_metadata_latency_sum: percpu_counter_destroy(&metric->total_metadatas); err_total_metadatas: @@ -4294,7 +4461,7 @@ int ceph_mdsc_init(struct ceph_fs_client *fsc) init_waitqueue_head(&mdsc->cap_flushing_wq); INIT_WORK(&mdsc->cap_reclaim_work, ceph_cap_reclaim_work); atomic_set(&mdsc->cap_reclaim_pending, 0); - err = ceph_mdsc_metric_init(&mdsc->metric); + err = ceph_mdsc_metric_init(mdsc); if (err) goto err_mdsmap; diff --git a/fs/ceph/mds_client.h b/fs/ceph/mds_client.h index 574d4e5a5de2..a0ece55d987c 100644 --- a/fs/ceph/mds_client.h +++ b/fs/ceph/mds_client.h @@ -451,6 +451,9 @@ struct ceph_mds_client { struct list_head dentry_leases; /* fifo list */ struct list_head dentry_dir_leases; /* lru list */ + /* metrics */ + unsigned int sending_metrics; + unsigned int ticks; struct ceph_client_metric metric; spinlock_t snapid_map_lock; diff --git a/fs/ceph/metric.h b/fs/ceph/metric.h index 3cda616ba594..352eb753ce25 100644 --- a/fs/ceph/metric.h +++ b/fs/ceph/metric.h @@ -4,6 +4,82 @@ #include +enum ceph_metric_type { + CLIENT_METRIC_TYPE_CAP_INFO, + CLIENT_METRIC_TYPE_READ_LATENCY, + CLIENT_METRIC_TYPE_WRITE_LATENCY, + CLIENT_METRIC_TYPE_METADATA_LATENCY, + CLIENT_METRIC_TYPE_DENTRY_LEASE, + + CLIENT_METRIC_TYPE_MAX = CLIENT_METRIC_TYPE_DENTRY_LEASE, +}; + +/* metric caps header */ +struct ceph_metric_cap { + __le32 type; /* ceph metric type */ + + __u8 ver; + __u8 campat; + + __le32 data_len; /* length of sizeof(hit + mis + total) */ + __le64 hit; + __le64 mis; + __le64 total; +} __attribute__ ((packed)); + +/* metric dentry lease header */ +struct ceph_metric_dentry_lease { + __le32 type; /* ceph metric type */ + + __u8 ver; + __u8 campat; + + __le32 data_len; /* length of sizeof(hit + mis + total) */ + __le64 hit; + __le64 mis; + __le64 total; +} __attribute__ ((packed)); + +/* metric read latency header */ +struct ceph_metric_read_latency { + __le32 type; /* ceph metric type */ + + __u8 ver; + __u8 campat; + + __le32 data_len; /* length of sizeof(sec + nsec) */ + __le32 sec; + __le32 nsec; +} __attribute__ ((packed)); + +/* metric write latency header */ +struct ceph_metric_write_latency { + __le32 type; /* ceph metric type */ + + __u8 ver; + __u8 campat; + + __le32 data_len; /* length of sizeof(sec + nsec) */ + __le32 sec; + __le32 nsec; +} __attribute__ ((packed)); + +/* metric metadata latency header */ +struct ceph_metric_metadata_latency { + __le32 type; /* ceph metric type */ + + __u8 ver; + __u8 campat; + + __le32 data_len; /* length of sizeof(sec + nsec) */ + __le32 sec; + __le32 nsec; +} __attribute__ ((packed)); + +struct ceph_metric_head { + __le32 num; /* the number of metrics will be sent */ +} __attribute__ ((packed)); + /* This is the global metrics */ struct ceph_client_metric { atomic64_t total_dentries; diff --git a/fs/ceph/super.h b/fs/ceph/super.h index 3f4829222528..a91431e9bdf7 100644 --- a/fs/ceph/super.h +++ b/fs/ceph/super.h @@ -128,6 +128,7 @@ struct ceph_fs_client { struct dentry *debugfs_congestion_kb; struct dentry *debugfs_bdi; struct dentry *debugfs_mdsc, *debugfs_mdsmap; + struct dentry *debugfs_sending_metrics; struct dentry *debugfs_metric; struct dentry *debugfs_mds_sessions; #endif diff --git a/include/linux/ceph/ceph_fs.h b/include/linux/ceph/ceph_fs.h index a099f60feb7b..6028d3e865e4 100644 --- a/include/linux/ceph/ceph_fs.h +++ b/include/linux/ceph/ceph_fs.h @@ -130,6 +130,7 @@ struct ceph_dir_layout { #define CEPH_MSG_CLIENT_REQUEST 24 #define CEPH_MSG_CLIENT_REQUEST_FORWARD 25 #define CEPH_MSG_CLIENT_REPLY 26 +#define CEPH_MSG_CLIENT_METRICS 29 #define CEPH_MSG_CLIENT_CAPS 0x310 #define CEPH_MSG_CLIENT_LEASE 0x311 #define CEPH_MSG_CLIENT_SNAP 0x312 From patchwork Tue Jan 28 13:02:46 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Xiubo Li X-Patchwork-Id: 11354239 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 78ED9138C for ; Tue, 28 Jan 2020 13:03:46 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 4B5CB24683 for ; Tue, 28 Jan 2020 13:03:46 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="I1K2DQvF" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726141AbgA1NDp (ORCPT ); Tue, 28 Jan 2020 08:03:45 -0500 Received: from us-smtp-delivery-1.mimecast.com ([205.139.110.120]:40408 "EHLO us-smtp-1.mimecast.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1726007AbgA1NDp (ORCPT ); Tue, 28 Jan 2020 08:03:45 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1580216624; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=CA1BOrINbemf/bjV6vz25R4YawJkALrPImty+Lbmlko=; b=I1K2DQvF1w6Ax7mDnUcHrJyClXDm28Wlcqo8pZOZSVonBcOJVgr3noPB/DGh8Oksoi6yEl 5iF5xWDgeHkY+/7KirR/zqRewwaovZrIRwu+23HeAJh76XejM5RoW82t08L/IhHDTnmum3 7h5yryG0VDrEad+qDVoKXRx3YOdl53c= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-158-U4Z6GEQyNJWMjPwrclqG8Q-1; Tue, 28 Jan 2020 08:03:41 -0500 X-MC-Unique: U4Z6GEQyNJWMjPwrclqG8Q-1 Received: from smtp.corp.redhat.com (int-mx01.intmail.prod.int.phx2.redhat.com [10.5.11.11]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id C01B818A6EC7; Tue, 28 Jan 2020 13:03:40 +0000 (UTC) Received: from localhost.localdomain (ovpn-12-34.pek2.redhat.com [10.72.12.34]) by smtp.corp.redhat.com (Postfix) with ESMTP id 79318863D3; Tue, 28 Jan 2020 13:03:32 +0000 (UTC) From: xiubli@redhat.com To: jlayton@kernel.org, idryomov@gmail.com, zyan@redhat.com Cc: sage@redhat.com, pdonnell@redhat.com, ceph-devel@vger.kernel.org, Xiubo Li Subject: [PATCH v5 08/10] ceph: add CEPH_DEFINE_RW_FUNC helper support Date: Tue, 28 Jan 2020 08:02:46 -0500 Message-Id: <20200128130248.4266-9-xiubli@redhat.com> In-Reply-To: <20200128130248.4266-1-xiubli@redhat.com> References: <20200128130248.4266-1-xiubli@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.11 Sender: ceph-devel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org From: Xiubo Li This will support the string store. Signed-off-by: Xiubo Li --- include/linux/ceph/debugfs.h | 14 ++++++++++++++ 1 file changed, 14 insertions(+) diff --git a/include/linux/ceph/debugfs.h b/include/linux/ceph/debugfs.h index cf5e840eec71..b918fb1f6f54 100644 --- a/include/linux/ceph/debugfs.h +++ b/include/linux/ceph/debugfs.h @@ -18,6 +18,20 @@ static const struct file_operations name##_fops = { \ .release = single_release, \ }; +#define CEPH_DEFINE_RW_FUNC(name) \ +static int name##_open(struct inode *inode, struct file *file) \ +{ \ + return single_open(file, name##_show, inode->i_private); \ +} \ + \ +static const struct file_operations name##_fops = { \ + .open = name##_open, \ + .read = seq_read, \ + .write = name##_store, \ + .llseek = seq_lseek, \ + .release = single_release, \ +}; + /* debugfs.c */ extern void ceph_debugfs_init(void); extern void ceph_debugfs_cleanup(void); From patchwork Tue Jan 28 13:02:47 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Xiubo Li X-Patchwork-Id: 11354241 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 989C0112B for ; Tue, 28 Jan 2020 13:03:49 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 779FE24683 for ; Tue, 28 Jan 2020 13:03:49 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="PYXAaRYr" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726143AbgA1NDs (ORCPT ); Tue, 28 Jan 2020 08:03:48 -0500 Received: from us-smtp-delivery-1.mimecast.com ([207.211.31.120]:21888 "EHLO us-smtp-1.mimecast.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1726034AbgA1NDs (ORCPT ); Tue, 28 Jan 2020 08:03:48 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1580216627; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=qT0u0w/rIGkT+szIUYmmBy4V5W+EnHMTgs0zakAMYac=; b=PYXAaRYr1Oy5CetBslb4+1zUQKunUEmyGiz2JC8iowMNUB33CESbTShOHo8vliN31SVRbi Vmhgb9tKTgtNI8mgdF6aAyBToPRIdEddJnc4Fpog5WAOHreR80fI8kCTNdtB8OPWG3XmHA SKgB3CFj3ZUcxwP2fSefG3lHo0uiK/c= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-69-eb1wbxVsMKaG20KWkbeFUg-1; Tue, 28 Jan 2020 08:03:45 -0500 X-MC-Unique: eb1wbxVsMKaG20KWkbeFUg-1 Received: from smtp.corp.redhat.com (int-mx01.intmail.prod.int.phx2.redhat.com [10.5.11.11]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 13A21107ACC5; Tue, 28 Jan 2020 13:03:44 +0000 (UTC) Received: from localhost.localdomain (ovpn-12-34.pek2.redhat.com [10.72.12.34]) by smtp.corp.redhat.com (Postfix) with ESMTP id 57E8E80A5C; Tue, 28 Jan 2020 13:03:41 +0000 (UTC) From: xiubli@redhat.com To: jlayton@kernel.org, idryomov@gmail.com, zyan@redhat.com Cc: sage@redhat.com, pdonnell@redhat.com, ceph-devel@vger.kernel.org, Xiubo Li Subject: [PATCH v5 09/10] ceph: add reset metrics support Date: Tue, 28 Jan 2020 08:02:47 -0500 Message-Id: <20200128130248.4266-10-xiubli@redhat.com> In-Reply-To: <20200128130248.4266-1-xiubli@redhat.com> References: <20200128130248.4266-1-xiubli@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.11 Sender: ceph-devel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org From: Xiubo Li This will reset the most metric counters, except the cap and dentry total numbers. Sometimes we need to discard the old metrics and start to get new metrics. URL: https://tracker.ceph.com/issues/43215 Signed-off-by: Xiubo Li --- fs/ceph/debugfs.c | 53 +++++++++++++++++++++++++++++++++++++++++++++-- fs/ceph/super.h | 1 + 2 files changed, 52 insertions(+), 2 deletions(-) diff --git a/fs/ceph/debugfs.c b/fs/ceph/debugfs.c index 8aae7ecea54a..37ca1efa6b27 100644 --- a/fs/ceph/debugfs.c +++ b/fs/ceph/debugfs.c @@ -217,6 +217,55 @@ static int metric_show(struct seq_file *s, void *p) return 0; } +static ssize_t metric_store(struct file *file, const char __user *user_buf, + size_t count, loff_t *ppos) +{ + struct seq_file *s = file->private_data; + struct ceph_fs_client *fsc = s->private; + struct ceph_mds_client *mdsc = fsc->mdsc; + struct ceph_client_metric *metric = &mdsc->metric; + char buf[8]; + int i; + + if (copy_from_user(buf, user_buf, 8)) + return -EFAULT; + + if (strcmp(buf, "reset")) { + pr_err("Invalid set value '%s', only 'reset' is valid\n", buf); + return -EINVAL; + } + + percpu_counter_set(&metric->d_lease_hit, 0); + percpu_counter_set(&metric->d_lease_mis, 0); + + percpu_counter_set(&metric->read_latency_sum, 0); + percpu_counter_set(&metric->total_reads, 0); + + percpu_counter_set(&metric->write_latency_sum, 0); + percpu_counter_set(&metric->total_writes, 0); + + percpu_counter_set(&metric->metadata_latency_sum, 0); + percpu_counter_set(&metric->total_metadatas, 0); + + mutex_lock(&mdsc->mutex); + for (i = 0; i < mdsc->max_sessions; i++) { + struct ceph_mds_session *session; + + session = __ceph_lookup_mds_session(mdsc, i); + if (!session) + continue; + percpu_counter_set(&session->i_caps_hit, 0); + percpu_counter_set(&session->i_caps_mis, 0); + ceph_put_mds_session(session); + } + + mutex_unlock(&mdsc->mutex); + + return 0; +} + +CEPH_DEFINE_RW_FUNC(metric) + static int caps_show_cb(struct inode *inode, struct ceph_cap *cap, void *p) { struct seq_file *s = p; @@ -313,7 +362,6 @@ static int mds_sessions_show(struct seq_file *s, void *ptr) CEPH_DEFINE_SHOW_FUNC(mdsmap_show) CEPH_DEFINE_SHOW_FUNC(mdsc_show) -CEPH_DEFINE_SHOW_FUNC(metric_show) CEPH_DEFINE_SHOW_FUNC(caps_show) CEPH_DEFINE_SHOW_FUNC(mds_sessions_show) @@ -349,6 +397,7 @@ void ceph_fs_debugfs_cleanup(struct ceph_fs_client *fsc) debugfs_remove(fsc->debugfs_caps); debugfs_remove(fsc->debugfs_metric); debugfs_remove(fsc->debugfs_sending_metrics); + debugfs_remove(fsc->debugfs_reset_metrics); debugfs_remove(fsc->debugfs_mdsc); } @@ -400,7 +449,7 @@ void ceph_fs_debugfs_init(struct ceph_fs_client *fsc) 0400, fsc->client->debugfs_dir, fsc, - &metric_show_fops); + &metric_fops); fsc->debugfs_caps = debugfs_create_file("caps", 0400, diff --git a/fs/ceph/super.h b/fs/ceph/super.h index a91431e9bdf7..d24929f1c4bf 100644 --- a/fs/ceph/super.h +++ b/fs/ceph/super.h @@ -129,6 +129,7 @@ struct ceph_fs_client { struct dentry *debugfs_bdi; struct dentry *debugfs_mdsc, *debugfs_mdsmap; struct dentry *debugfs_sending_metrics; + struct dentry *debugfs_reset_metrics; struct dentry *debugfs_metric; struct dentry *debugfs_mds_sessions; #endif From patchwork Tue Jan 28 13:02:48 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Xiubo Li X-Patchwork-Id: 11354243 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 1C242112B for ; Tue, 28 Jan 2020 13:03:58 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id EF59E24683 for ; Tue, 28 Jan 2020 13:03:57 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="asz+svzI" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726182AbgA1ND5 (ORCPT ); Tue, 28 Jan 2020 08:03:57 -0500 Received: from us-smtp-delivery-1.mimecast.com ([207.211.31.120]:58147 "EHLO us-smtp-1.mimecast.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1726034AbgA1ND5 (ORCPT ); Tue, 28 Jan 2020 08:03:57 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1580216636; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=95UghL6t9xaBh20+wUUUrrakBdh8t5YxMnU5/sz3IQA=; b=asz+svzIhk6tH05GmTHqhNpAhmYsA+iXvCQHayN/Rmgr0PfPcdMN5pRSUT73cqkhF7N6g9 izi5/M2zctmZfj82aGxaYEisTIixxKAT50a02elkwmtbNBIYAEa0e1Y75ijBaaat3Cb9uA aeE1H2BJ6fhMmV0f2WEMnMspmwDVziM= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-275-Mdyo8dfoPkaW9lt6tH1V9g-1; Tue, 28 Jan 2020 08:03:48 -0500 X-MC-Unique: Mdyo8dfoPkaW9lt6tH1V9g-1 Received: from smtp.corp.redhat.com (int-mx01.intmail.prod.int.phx2.redhat.com [10.5.11.11]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 61A4D8010C9; Tue, 28 Jan 2020 13:03:47 +0000 (UTC) Received: from localhost.localdomain (ovpn-12-34.pek2.redhat.com [10.72.12.34]) by smtp.corp.redhat.com (Postfix) with ESMTP id A1BD780A5C; Tue, 28 Jan 2020 13:03:44 +0000 (UTC) From: xiubli@redhat.com To: jlayton@kernel.org, idryomov@gmail.com, zyan@redhat.com Cc: sage@redhat.com, pdonnell@redhat.com, ceph-devel@vger.kernel.org, Xiubo Li Subject: [PATCH v5 10/10] ceph: send client provided metric flags in client metadata Date: Tue, 28 Jan 2020 08:02:48 -0500 Message-Id: <20200128130248.4266-11-xiubli@redhat.com> In-Reply-To: <20200128130248.4266-1-xiubli@redhat.com> References: <20200128130248.4266-1-xiubli@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.11 Sender: ceph-devel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org From: Xiubo Li Will send the metric flags to MDS, currently it supports the cap, dentry lease, read latency, write latency and metadata latency. URL: https://tracker.ceph.com/issues/43435 Signed-off-by: Xiubo Li --- fs/ceph/mds_client.c | 47 ++++++++++++++++++++++++++++++++++++++++++-- fs/ceph/metric.h | 14 +++++++++++++ 2 files changed, 59 insertions(+), 2 deletions(-) diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c index d765804dc855..f9d3acd36656 100644 --- a/fs/ceph/mds_client.c +++ b/fs/ceph/mds_client.c @@ -1096,6 +1096,41 @@ static void encode_supported_features(void **p, void *end) } } +static const unsigned char metric_bits[] = CEPHFS_METRIC_SPEC_CLIENT_SUPPORTED; +#define METRIC_BYTES(cnt) (DIV_ROUND_UP((size_t)metric_bits[cnt - 1] + 1, 64) * 8) +static void encode_metric_spec(void **p, void *end) +{ + static const size_t count = ARRAY_SIZE(metric_bits); + + /* header */ + BUG_ON(*p + 2 > end); + ceph_encode_8(p, 1); /* version */ + ceph_encode_8(p, 1); /* compat */ + + if (count > 0) { + size_t i; + size_t size = METRIC_BYTES(count); + + BUG_ON(*p + 4 + 4 + size > end); + + /* metric spec info length */ + ceph_encode_32(p, 4 + size); + + /* metric spec */ + ceph_encode_32(p, size); + memset(*p, 0, size); + for (i = 0; i < count; i++) + ((unsigned char*)(*p))[i / 8] |= BIT(metric_bits[i] % 8); + *p += size; + } else { + BUG_ON(*p + 4 + 4 > end); + /* metric spec info length */ + ceph_encode_32(p, 4); + /* metric spec */ + ceph_encode_32(p, 0); + } +} + /* * session message, specialization for CEPH_SESSION_REQUEST_OPEN * to include additional client metadata fields. @@ -1135,6 +1170,13 @@ static struct ceph_msg *create_session_open_msg(struct ceph_mds_client *mdsc, u6 size = FEATURE_BYTES(count); extra_bytes += 4 + size; + /* metric spec */ + size = 0; + count = ARRAY_SIZE(metric_bits); + if (count > 0) + size = METRIC_BYTES(count); + extra_bytes += 2 + 4 + 4 + size; + /* Allocate the message */ msg = ceph_msg_new(CEPH_MSG_CLIENT_SESSION, sizeof(*h) + extra_bytes, GFP_NOFS, false); @@ -1153,9 +1195,9 @@ static struct ceph_msg *create_session_open_msg(struct ceph_mds_client *mdsc, u6 * Serialize client metadata into waiting buffer space, using * the format that userspace expects for map * - * ClientSession messages with metadata are v3 + * ClientSession messages with metadata are v4 */ - msg->hdr.version = cpu_to_le16(3); + msg->hdr.version = cpu_to_le16(4); msg->hdr.compat_version = cpu_to_le16(1); /* The write pointer, following the session_head structure */ @@ -1178,6 +1220,7 @@ static struct ceph_msg *create_session_open_msg(struct ceph_mds_client *mdsc, u6 } encode_supported_features(&p, end); + encode_metric_spec(&p, end); msg->front.iov_len = p - msg->front.iov_base; msg->hdr.front_len = cpu_to_le32(msg->front.iov_len); diff --git a/fs/ceph/metric.h b/fs/ceph/metric.h index 352eb753ce25..70e0b586b687 100644 --- a/fs/ceph/metric.h +++ b/fs/ceph/metric.h @@ -14,6 +14,20 @@ enum ceph_metric_type { CLIENT_METRIC_TYPE_MAX = CLIENT_METRIC_TYPE_DENTRY_LEASE, }; +/* + * This will always have the highest metric bit value + * as the last element of the array. + */ +#define CEPHFS_METRIC_SPEC_CLIENT_SUPPORTED { \ + CLIENT_METRIC_TYPE_CAP_INFO, \ + CLIENT_METRIC_TYPE_READ_LATENCY, \ + CLIENT_METRIC_TYPE_WRITE_LATENCY, \ + CLIENT_METRIC_TYPE_METADATA_LATENCY, \ + CLIENT_METRIC_TYPE_DENTRY_LEASE, \ + \ + CLIENT_METRIC_TYPE_MAX, \ +} + /* metric caps header */ struct ceph_metric_cap { __le32 type; /* ceph metric type */