Message ID | 20230606005732.1056361-1-xiubli@redhat.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [v2] ceph: only send metrics when the MDS rank is ready | expand |
Hi Xiubo, On Tue, Jun 6, 2023 at 6:30 AM <xiubli@redhat.com> wrote: > > From: Xiubo Li <xiubli@redhat.com> > > When the MDS rank is in clientreplay state, the metrics requests > will be discarded directly. Also, when there are a lot of known > client requests to recover from, the metrics requests will slow > down the MDS rank from getting to the active state sooner. > > With this patch, we will send the metrics requests only when the > MDS rank is in active state. Although the changes look fine. I have a question though - the metrics are sent by the client each second (on tick()) - how many clients were connected for the MDS to experience further slowness due to metric update messages? > > URL: https://tracker.ceph.com/issues/61524 > Reviewed-by: Milind Changire <mchangir@redhat.com> > Signed-off-by: Xiubo Li <xiubli@redhat.com> > --- > > V2: > - rephrase the commit comment from Milind's comments. > > > fs/ceph/metric.c | 8 ++++++++ > 1 file changed, 8 insertions(+) > > diff --git a/fs/ceph/metric.c b/fs/ceph/metric.c > index c47347d2e84e..cce78d769f55 100644 > --- a/fs/ceph/metric.c > +++ b/fs/ceph/metric.c > @@ -36,6 +36,14 @@ static bool ceph_mdsc_send_metrics(struct ceph_mds_client *mdsc, > s32 items = 0; > s32 len; > > + /* Do not send the metrics until the MDS rank is ready */ > + mutex_lock(&mdsc->mutex); > + if (ceph_mdsmap_get_state(mdsc->mdsmap, s->s_mds) != CEPH_MDS_STATE_ACTIVE) { > + mutex_unlock(&mdsc->mutex); > + return false; > + } > + mutex_unlock(&mdsc->mutex); > + > len = sizeof(*head) + sizeof(*cap) + sizeof(*read) + sizeof(*write) > + sizeof(*meta) + sizeof(*dlease) + sizeof(*files) > + sizeof(*icaps) + sizeof(*inodes) + sizeof(*rsize) > -- > 2.40.1 >
On 6/16/23 12:40, Venky Shankar wrote: > Hi Xiubo, > > On Tue, Jun 6, 2023 at 6:30 AM <xiubli@redhat.com> wrote: >> From: Xiubo Li <xiubli@redhat.com> >> >> When the MDS rank is in clientreplay state, the metrics requests >> will be discarded directly. Also, when there are a lot of known >> client requests to recover from, the metrics requests will slow >> down the MDS rank from getting to the active state sooner. >> >> With this patch, we will send the metrics requests only when the >> MDS rank is in active state. > Although the changes look fine. I have a question though - the metrics > are sent by the client each second (on tick()) - how many clients were > connected for the MDS to experience further slowness due to metric > update messages? As I remembered there were 1761 clients. The MDS will just queue the metric messages and then trigger to print one debug log message , which could slow down the MDS, to buffer and file(?) for each when handling them. Thanks - Xiubo > >> URL: https://tracker.ceph.com/issues/61524 >> Reviewed-by: Milind Changire <mchangir@redhat.com> >> Signed-off-by: Xiubo Li <xiubli@redhat.com> >> --- >> >> V2: >> - rephrase the commit comment from Milind's comments. >> >> >> fs/ceph/metric.c | 8 ++++++++ >> 1 file changed, 8 insertions(+) >> >> diff --git a/fs/ceph/metric.c b/fs/ceph/metric.c >> index c47347d2e84e..cce78d769f55 100644 >> --- a/fs/ceph/metric.c >> +++ b/fs/ceph/metric.c >> @@ -36,6 +36,14 @@ static bool ceph_mdsc_send_metrics(struct ceph_mds_client *mdsc, >> s32 items = 0; >> s32 len; >> >> + /* Do not send the metrics until the MDS rank is ready */ >> + mutex_lock(&mdsc->mutex); >> + if (ceph_mdsmap_get_state(mdsc->mdsmap, s->s_mds) != CEPH_MDS_STATE_ACTIVE) { >> + mutex_unlock(&mdsc->mutex); >> + return false; >> + } >> + mutex_unlock(&mdsc->mutex); >> + >> len = sizeof(*head) + sizeof(*cap) + sizeof(*read) + sizeof(*write) >> + sizeof(*meta) + sizeof(*dlease) + sizeof(*files) >> + sizeof(*icaps) + sizeof(*inodes) + sizeof(*rsize) >> -- >> 2.40.1 >> >
diff --git a/fs/ceph/metric.c b/fs/ceph/metric.c index c47347d2e84e..cce78d769f55 100644 --- a/fs/ceph/metric.c +++ b/fs/ceph/metric.c @@ -36,6 +36,14 @@ static bool ceph_mdsc_send_metrics(struct ceph_mds_client *mdsc, s32 items = 0; s32 len; + /* Do not send the metrics until the MDS rank is ready */ + mutex_lock(&mdsc->mutex); + if (ceph_mdsmap_get_state(mdsc->mdsmap, s->s_mds) != CEPH_MDS_STATE_ACTIVE) { + mutex_unlock(&mdsc->mutex); + return false; + } + mutex_unlock(&mdsc->mutex); + len = sizeof(*head) + sizeof(*cap) + sizeof(*read) + sizeof(*write) + sizeof(*meta) + sizeof(*dlease) + sizeof(*files) + sizeof(*icaps) + sizeof(*inodes) + sizeof(*rsize)