Message ID | 20230612114359.220895-1-xiubli@redhat.com (mailing list archive) |
---|---|
Headers | show |
Series | ceph: print the client global id for debug logs | expand |
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
On Mon, Jun 12, 2023 at 1:46 PM <xiubli@redhat.com> wrote: > > From: Xiubo Li <xiubli@redhat.com> > > Multiple cephfs mounts on a host is increasingly common so disambiguating > messages like this is necessary and will make it easier to debug > issues. > > URL: https://tracker.ceph.com/issues/61590 > Cc: Patrick Donnelly <pdonnell@redhat.com> > Signed-off-by: Xiubo Li <xiubli@redhat.com> > --- > fs/ceph/acl.c | 6 +- > fs/ceph/addr.c | 300 ++++++++++-------- > fs/ceph/caps.c | 709 ++++++++++++++++++++++++------------------- > fs/ceph/crypto.c | 45 ++- > fs/ceph/debugfs.c | 4 +- > fs/ceph/dir.c | 222 +++++++++----- > fs/ceph/export.c | 39 ++- > fs/ceph/file.c | 268 +++++++++------- > fs/ceph/inode.c | 528 ++++++++++++++++++-------------- > fs/ceph/ioctl.c | 10 +- > fs/ceph/locks.c | 62 ++-- > fs/ceph/mds_client.c | 616 +++++++++++++++++++++---------------- > fs/ceph/mdsmap.c | 25 +- > fs/ceph/metric.c | 5 +- > fs/ceph/quota.c | 31 +- > fs/ceph/snap.c | 186 +++++++----- > fs/ceph/super.c | 64 ++-- > fs/ceph/xattr.c | 97 +++--- > 18 files changed, 1887 insertions(+), 1330 deletions(-) > > diff --git a/fs/ceph/acl.c b/fs/ceph/acl.c > index 8a56f979c7cb..970acd07908d 100644 > --- a/fs/ceph/acl.c > +++ b/fs/ceph/acl.c > @@ -15,6 +15,7 @@ > #include <linux/slab.h> > > #include "super.h" > +#include "mds_client.h" > > static inline void ceph_set_cached_acl(struct inode *inode, > int type, struct posix_acl *acl) > @@ -31,6 +32,7 @@ static inline void ceph_set_cached_acl(struct inode *inode, > > struct posix_acl *ceph_get_acl(struct inode *inode, int type, bool rcu) > { > + struct ceph_client *cl = ceph_inode_to_client(inode); > int size; > unsigned int retry_cnt = 0; > const char *name; > @@ -72,8 +74,8 @@ struct posix_acl *ceph_get_acl(struct inode *inode, int type, bool rcu) > } else if (size == -ENODATA || size == 0) { > acl = NULL; > } else { > - pr_err_ratelimited("get acl %llx.%llx failed, err=%d\n", > - ceph_vinop(inode), size); > + pr_err_ratelimited_client(cl, "%s %llx.%llx failed, err=%d\n", > + __func__, ceph_vinop(inode), size); > acl = ERR_PTR(-EIO); > } > > diff --git a/fs/ceph/addr.c b/fs/ceph/addr.c > index e62318b3e13d..c772639dc0cb 100644 > --- a/fs/ceph/addr.c > +++ b/fs/ceph/addr.c > @@ -79,18 +79,18 @@ static inline struct ceph_snap_context *page_snap_context(struct page *page) > */ > static bool ceph_dirty_folio(struct address_space *mapping, struct folio *folio) > { > - struct inode *inode; > + struct inode *inode = mapping->host; > + struct ceph_client *cl = ceph_inode_to_client(inode); > struct ceph_inode_info *ci; > struct ceph_snap_context *snapc; > > if (folio_test_dirty(folio)) { > - dout("%p dirty_folio %p idx %lu -- already dirty\n", > - mapping->host, folio, folio->index); > + dout_client(cl, "%s %llx.%llx %p idx %lu -- already dirty\n", > + __func__, ceph_vinop(inode), folio, folio->index); While having context information attached to each dout is nice, it certainly comes at a price of a lot of churn and automated backport disruption. I wonder how much value doing this for douts as opposed to just pr_* messages actually brings? A regular user never sees douts as enabling them without extra care tends to be useless -- journald can't cope with the volume and quickly starts discarding them. From the developer side, many douts already include at least one (hashed) pointer value which is usually sufficient to disambiguate, even if it's more cumbersome than a grep on context information. Thanks, Ilya
On 6/13/23 17:07, Ilya Dryomov wrote: > On Mon, Jun 12, 2023 at 1:46 PM <xiubli@redhat.com> wrote: >> From: Xiubo Li <xiubli@redhat.com> >> >> Multiple cephfs mounts on a host is increasingly common so disambiguating >> messages like this is necessary and will make it easier to debug >> issues. >> >> URL: https://tracker.ceph.com/issues/61590 >> Cc: Patrick Donnelly <pdonnell@redhat.com> >> Signed-off-by: Xiubo Li <xiubli@redhat.com> >> --- >> fs/ceph/acl.c | 6 +- >> fs/ceph/addr.c | 300 ++++++++++-------- >> fs/ceph/caps.c | 709 ++++++++++++++++++++++++------------------- >> fs/ceph/crypto.c | 45 ++- >> fs/ceph/debugfs.c | 4 +- >> fs/ceph/dir.c | 222 +++++++++----- >> fs/ceph/export.c | 39 ++- >> fs/ceph/file.c | 268 +++++++++------- >> fs/ceph/inode.c | 528 ++++++++++++++++++-------------- >> fs/ceph/ioctl.c | 10 +- >> fs/ceph/locks.c | 62 ++-- >> fs/ceph/mds_client.c | 616 +++++++++++++++++++++---------------- >> fs/ceph/mdsmap.c | 25 +- >> fs/ceph/metric.c | 5 +- >> fs/ceph/quota.c | 31 +- >> fs/ceph/snap.c | 186 +++++++----- >> fs/ceph/super.c | 64 ++-- >> fs/ceph/xattr.c | 97 +++--- >> 18 files changed, 1887 insertions(+), 1330 deletions(-) >> >> diff --git a/fs/ceph/acl.c b/fs/ceph/acl.c >> index 8a56f979c7cb..970acd07908d 100644 >> --- a/fs/ceph/acl.c >> +++ b/fs/ceph/acl.c >> @@ -15,6 +15,7 @@ >> #include <linux/slab.h> >> >> #include "super.h" >> +#include "mds_client.h" >> >> static inline void ceph_set_cached_acl(struct inode *inode, >> int type, struct posix_acl *acl) >> @@ -31,6 +32,7 @@ static inline void ceph_set_cached_acl(struct inode *inode, >> >> struct posix_acl *ceph_get_acl(struct inode *inode, int type, bool rcu) >> { >> + struct ceph_client *cl = ceph_inode_to_client(inode); >> int size; >> unsigned int retry_cnt = 0; >> const char *name; >> @@ -72,8 +74,8 @@ struct posix_acl *ceph_get_acl(struct inode *inode, int type, bool rcu) >> } else if (size == -ENODATA || size == 0) { >> acl = NULL; >> } else { >> - pr_err_ratelimited("get acl %llx.%llx failed, err=%d\n", >> - ceph_vinop(inode), size); >> + pr_err_ratelimited_client(cl, "%s %llx.%llx failed, err=%d\n", >> + __func__, ceph_vinop(inode), size); >> acl = ERR_PTR(-EIO); >> } >> >> diff --git a/fs/ceph/addr.c b/fs/ceph/addr.c >> index e62318b3e13d..c772639dc0cb 100644 >> --- a/fs/ceph/addr.c >> +++ b/fs/ceph/addr.c >> @@ -79,18 +79,18 @@ static inline struct ceph_snap_context *page_snap_context(struct page *page) >> */ >> static bool ceph_dirty_folio(struct address_space *mapping, struct folio *folio) >> { >> - struct inode *inode; >> + struct inode *inode = mapping->host; >> + struct ceph_client *cl = ceph_inode_to_client(inode); >> struct ceph_inode_info *ci; >> struct ceph_snap_context *snapc; >> >> if (folio_test_dirty(folio)) { >> - dout("%p dirty_folio %p idx %lu -- already dirty\n", >> - mapping->host, folio, folio->index); >> + dout_client(cl, "%s %llx.%llx %p idx %lu -- already dirty\n", >> + __func__, ceph_vinop(inode), folio, folio->index); > While having context information attached to each dout is nice, it > certainly comes at a price of a lot of churn and automated backport > disruption. Yeah, certainly this will break automated backporting. But this should be okay, I can generate the backport patches for each stable release for this patch series, so after this it will make the automated backporting work. > I wonder how much value doing this for douts as opposed > to just pr_* messages actually brings? I think the 'dout()' was introduced by printing more context info, which includes module/function names and line#, when the CONFIG_CEPH_LIB_PRETTYDEBUG is enabled. Maybe we can remove CONFIG_CEPH_LIB_PRETTYDEBUG now, since the pr_* will print the module name and also the caller for dout() and pr_* will print the function name mostly ? > A regular user never sees douts as enabling them without extra care > tends to be useless -- journald can't cope with the volume and quickly > starts discarding them. From the developer side, many douts already > include at least one (hashed) pointer value which is usually sufficient > to disambiguate, even if it's more cumbersome than a grep on context > information. Just in some cases the client global_id is really could help much, especially for the issues only could be hit in productions case, which could have hundreds of clients, and we can only enable some dout() debug logs. Thanks - Xiubo > > Thanks, > > Ilya >
On Tue, Jun 13, 2023 at 11:51 AM Xiubo Li <xiubli@redhat.com> wrote: > > > On 6/13/23 17:07, Ilya Dryomov wrote: > > On Mon, Jun 12, 2023 at 1:46 PM <xiubli@redhat.com> wrote: > >> From: Xiubo Li <xiubli@redhat.com> > >> > >> Multiple cephfs mounts on a host is increasingly common so disambiguating > >> messages like this is necessary and will make it easier to debug > >> issues. > >> > >> URL: https://tracker.ceph.com/issues/61590 > >> Cc: Patrick Donnelly <pdonnell@redhat.com> > >> Signed-off-by: Xiubo Li <xiubli@redhat.com> > >> --- > >> fs/ceph/acl.c | 6 +- > >> fs/ceph/addr.c | 300 ++++++++++-------- > >> fs/ceph/caps.c | 709 ++++++++++++++++++++++++------------------- > >> fs/ceph/crypto.c | 45 ++- > >> fs/ceph/debugfs.c | 4 +- > >> fs/ceph/dir.c | 222 +++++++++----- > >> fs/ceph/export.c | 39 ++- > >> fs/ceph/file.c | 268 +++++++++------- > >> fs/ceph/inode.c | 528 ++++++++++++++++++-------------- > >> fs/ceph/ioctl.c | 10 +- > >> fs/ceph/locks.c | 62 ++-- > >> fs/ceph/mds_client.c | 616 +++++++++++++++++++++---------------- > >> fs/ceph/mdsmap.c | 25 +- > >> fs/ceph/metric.c | 5 +- > >> fs/ceph/quota.c | 31 +- > >> fs/ceph/snap.c | 186 +++++++----- > >> fs/ceph/super.c | 64 ++-- > >> fs/ceph/xattr.c | 97 +++--- > >> 18 files changed, 1887 insertions(+), 1330 deletions(-) > >> > >> diff --git a/fs/ceph/acl.c b/fs/ceph/acl.c > >> index 8a56f979c7cb..970acd07908d 100644 > >> --- a/fs/ceph/acl.c > >> +++ b/fs/ceph/acl.c > >> @@ -15,6 +15,7 @@ > >> #include <linux/slab.h> > >> > >> #include "super.h" > >> +#include "mds_client.h" > >> > >> static inline void ceph_set_cached_acl(struct inode *inode, > >> int type, struct posix_acl *acl) > >> @@ -31,6 +32,7 @@ static inline void ceph_set_cached_acl(struct inode *inode, > >> > >> struct posix_acl *ceph_get_acl(struct inode *inode, int type, bool rcu) > >> { > >> + struct ceph_client *cl = ceph_inode_to_client(inode); > >> int size; > >> unsigned int retry_cnt = 0; > >> const char *name; > >> @@ -72,8 +74,8 @@ struct posix_acl *ceph_get_acl(struct inode *inode, int type, bool rcu) > >> } else if (size == -ENODATA || size == 0) { > >> acl = NULL; > >> } else { > >> - pr_err_ratelimited("get acl %llx.%llx failed, err=%d\n", > >> - ceph_vinop(inode), size); > >> + pr_err_ratelimited_client(cl, "%s %llx.%llx failed, err=%d\n", > >> + __func__, ceph_vinop(inode), size); > >> acl = ERR_PTR(-EIO); > >> } > >> > >> diff --git a/fs/ceph/addr.c b/fs/ceph/addr.c > >> index e62318b3e13d..c772639dc0cb 100644 > >> --- a/fs/ceph/addr.c > >> +++ b/fs/ceph/addr.c > >> @@ -79,18 +79,18 @@ static inline struct ceph_snap_context *page_snap_context(struct page *page) > >> */ > >> static bool ceph_dirty_folio(struct address_space *mapping, struct folio *folio) > >> { > >> - struct inode *inode; > >> + struct inode *inode = mapping->host; > >> + struct ceph_client *cl = ceph_inode_to_client(inode); > >> struct ceph_inode_info *ci; > >> struct ceph_snap_context *snapc; > >> > >> if (folio_test_dirty(folio)) { > >> - dout("%p dirty_folio %p idx %lu -- already dirty\n", > >> - mapping->host, folio, folio->index); > >> + dout_client(cl, "%s %llx.%llx %p idx %lu -- already dirty\n", > >> + __func__, ceph_vinop(inode), folio, folio->index); > > While having context information attached to each dout is nice, it > > certainly comes at a price of a lot of churn and automated backport > > disruption. > > Yeah, certainly this will break automated backporting. But this should > be okay, I can generate the backport patches for each stable release for > this patch series, so after this it will make the automated backporting > work. > > > I wonder how much value doing this for douts as opposed > > to just pr_* messages actually brings? > > I think the 'dout()' was introduced by printing more context info, which > includes module/function names and line#, when the > CONFIG_CEPH_LIB_PRETTYDEBUG is enabled. dout() is just a wrapper around pr_debug(). It doesn't have anything to do with CONFIG_CEPH_LIB_PRETTYDEBUG per se which adds file names and line numbers. IIRC dout() by itself just adds a space at the front to make debugging spew stand out. > > Maybe we can remove CONFIG_CEPH_LIB_PRETTYDEBUG now, since the pr_* will > print the module name and also the caller for dout() and pr_* will print > the function name mostly ? To the best of my knowledge, CONFIG_CEPH_LIB_PRETTYDEBUG has always been disabled by default and it's not enabled by any distribution in their kernels. I don't think anyone out there would miss it, but then it's not hurting either -- it's less than 20 lines of code with all ifdef-ery included. Thanks, Ilya
On 6/14/23 05:50, Ilya Dryomov wrote: > On Tue, Jun 13, 2023 at 11:51 AM Xiubo Li <xiubli@redhat.com> wrote: >> >> On 6/13/23 17:07, Ilya Dryomov wrote: >>> On Mon, Jun 12, 2023 at 1:46 PM <xiubli@redhat.com> wrote: >>>> From: Xiubo Li <xiubli@redhat.com> >>>> >>>> Multiple cephfs mounts on a host is increasingly common so disambiguating >>>> messages like this is necessary and will make it easier to debug >>>> issues. >>>> >>>> URL: https://tracker.ceph.com/issues/61590 >>>> Cc: Patrick Donnelly <pdonnell@redhat.com> >>>> Signed-off-by: Xiubo Li <xiubli@redhat.com> >>>> --- >>>> fs/ceph/acl.c | 6 +- >>>> fs/ceph/addr.c | 300 ++++++++++-------- >>>> fs/ceph/caps.c | 709 ++++++++++++++++++++++++------------------- >>>> fs/ceph/crypto.c | 45 ++- >>>> fs/ceph/debugfs.c | 4 +- >>>> fs/ceph/dir.c | 222 +++++++++----- >>>> fs/ceph/export.c | 39 ++- >>>> fs/ceph/file.c | 268 +++++++++------- >>>> fs/ceph/inode.c | 528 ++++++++++++++++++-------------- >>>> fs/ceph/ioctl.c | 10 +- >>>> fs/ceph/locks.c | 62 ++-- >>>> fs/ceph/mds_client.c | 616 +++++++++++++++++++++---------------- >>>> fs/ceph/mdsmap.c | 25 +- >>>> fs/ceph/metric.c | 5 +- >>>> fs/ceph/quota.c | 31 +- >>>> fs/ceph/snap.c | 186 +++++++----- >>>> fs/ceph/super.c | 64 ++-- >>>> fs/ceph/xattr.c | 97 +++--- >>>> 18 files changed, 1887 insertions(+), 1330 deletions(-) >>>> >>>> diff --git a/fs/ceph/acl.c b/fs/ceph/acl.c >>>> index 8a56f979c7cb..970acd07908d 100644 >>>> --- a/fs/ceph/acl.c >>>> +++ b/fs/ceph/acl.c >>>> @@ -15,6 +15,7 @@ >>>> #include <linux/slab.h> >>>> >>>> #include "super.h" >>>> +#include "mds_client.h" >>>> >>>> static inline void ceph_set_cached_acl(struct inode *inode, >>>> int type, struct posix_acl *acl) >>>> @@ -31,6 +32,7 @@ static inline void ceph_set_cached_acl(struct inode *inode, >>>> >>>> struct posix_acl *ceph_get_acl(struct inode *inode, int type, bool rcu) >>>> { >>>> + struct ceph_client *cl = ceph_inode_to_client(inode); >>>> int size; >>>> unsigned int retry_cnt = 0; >>>> const char *name; >>>> @@ -72,8 +74,8 @@ struct posix_acl *ceph_get_acl(struct inode *inode, int type, bool rcu) >>>> } else if (size == -ENODATA || size == 0) { >>>> acl = NULL; >>>> } else { >>>> - pr_err_ratelimited("get acl %llx.%llx failed, err=%d\n", >>>> - ceph_vinop(inode), size); >>>> + pr_err_ratelimited_client(cl, "%s %llx.%llx failed, err=%d\n", >>>> + __func__, ceph_vinop(inode), size); >>>> acl = ERR_PTR(-EIO); >>>> } >>>> >>>> diff --git a/fs/ceph/addr.c b/fs/ceph/addr.c >>>> index e62318b3e13d..c772639dc0cb 100644 >>>> --- a/fs/ceph/addr.c >>>> +++ b/fs/ceph/addr.c >>>> @@ -79,18 +79,18 @@ static inline struct ceph_snap_context *page_snap_context(struct page *page) >>>> */ >>>> static bool ceph_dirty_folio(struct address_space *mapping, struct folio *folio) >>>> { >>>> - struct inode *inode; >>>> + struct inode *inode = mapping->host; >>>> + struct ceph_client *cl = ceph_inode_to_client(inode); >>>> struct ceph_inode_info *ci; >>>> struct ceph_snap_context *snapc; >>>> >>>> if (folio_test_dirty(folio)) { >>>> - dout("%p dirty_folio %p idx %lu -- already dirty\n", >>>> - mapping->host, folio, folio->index); >>>> + dout_client(cl, "%s %llx.%llx %p idx %lu -- already dirty\n", >>>> + __func__, ceph_vinop(inode), folio, folio->index); >>> While having context information attached to each dout is nice, it >>> certainly comes at a price of a lot of churn and automated backport >>> disruption. >> Yeah, certainly this will break automated backporting. But this should >> be okay, I can generate the backport patches for each stable release for >> this patch series, so after this it will make the automated backporting >> work. >> >>> I wonder how much value doing this for douts as opposed >>> to just pr_* messages actually brings? >> I think the 'dout()' was introduced by printing more context info, which >> includes module/function names and line#, when the >> CONFIG_CEPH_LIB_PRETTYDEBUG is enabled. > dout() is just a wrapper around pr_debug(). It doesn't have anything > to do with CONFIG_CEPH_LIB_PRETTYDEBUG per se which adds file names and > line numbers. IIRC dout() by itself just adds a space at the front to > make debugging spew stand out. > >> Maybe we can remove CONFIG_CEPH_LIB_PRETTYDEBUG now, since the pr_* will >> print the module name and also the caller for dout() and pr_* will print >> the function name mostly ? > To the best of my knowledge, CONFIG_CEPH_LIB_PRETTYDEBUG has always > been disabled by default and it's not enabled by any distribution in > their kernels. I don't think anyone out there would miss it, but then > it's not hurting either -- it's less than 20 lines of code with all > ifdef-ery included. Okay. I won't touch it. Thanks > Thanks, > > Ilya >
From: Xiubo Li <xiubli@redhat.com> V2: - print the client global id for all the dout()/pr_warn(),etc. This is based the testing branch. Xiubo Li (6): ceph: add the *_client debug macros support ceph: pass the mdsc to several helpers ceph: rename _to_client() to _to_fs_client() ceph: move mdsmap.h to fs/ceph/ ceph: add ceph_inode_to_client() helper support ceph: print the client global_id in all the debug logs fs/ceph/acl.c | 6 +- fs/ceph/addr.c | 320 +++++++----- fs/ceph/cache.c | 2 +- fs/ceph/caps.c | 762 ++++++++++++++++------------ fs/ceph/crypto.c | 47 +- fs/ceph/debugfs.c | 8 +- fs/ceph/dir.c | 242 +++++---- fs/ceph/export.c | 49 +- fs/ceph/file.c | 294 ++++++----- fs/ceph/inode.c | 542 +++++++++++--------- fs/ceph/ioctl.c | 18 +- fs/ceph/locks.c | 62 ++- fs/ceph/mds_client.c | 655 ++++++++++++++---------- fs/ceph/mds_client.h | 5 +- fs/ceph/mdsmap.c | 30 +- {include/linux => fs}/ceph/mdsmap.h | 5 +- fs/ceph/metric.c | 5 +- fs/ceph/quota.c | 31 +- fs/ceph/snap.c | 204 ++++---- fs/ceph/super.c | 86 ++-- fs/ceph/super.h | 19 +- fs/ceph/xattr.c | 109 ++-- include/linux/ceph/ceph_debug.h | 38 +- 23 files changed, 2078 insertions(+), 1461 deletions(-) rename {include/linux => fs}/ceph/mdsmap.h (92%)