Message ID | 1572925690-7336-1-git-send-email-liuyixian@huawei.com (mailing list archive) |
---|---|
State | Changes Requested |
Headers | show |
Series | [rdma-next] RDMA/core: Use pr_warn_ratelimited | expand |
Hi Yixian Liu, > -----Original Message----- > From: linux-rdma-owner@vger.kernel.org <linux-rdma- > owner@vger.kernel.org> On Behalf Of Yixian Liu > Sent: Monday, November 4, 2019 9:48 PM > To: dledford@redhat.com; jgg@ziepe.ca > Cc: Parav Pandit <parav@mellanox.com>; leon@kernel.org; linux- > rdma@vger.kernel.org; linuxarm@huawei.com > Subject: [PATCH rdma-next] RDMA/core: Use pr_warn_ratelimited > > This warning can be triggered easily when adding a large number of VLANs > while the capacity of GID table is small. > > Fixes: 598ff6bae689 ("IB/core: Refactor GID modify code for RoCE") > Signed-off-by: Yixian Liu <liuyixian@huawei.com> > --- > drivers/infiniband/core/cache.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > Thanks for the patch. However, vlan netdevice addition is an administrative operation allowed to process which has CAP_NET_ADMIN privilege. So large number of VLAN addition by a user for a RoCE device whose GID capacity is small is constrained operation. Therefore, we shouldn't be rate limiting it. By rate limiting we miss the information about any bugs in GID cache management. At best we can convert it to dev_dbg() so that we still get the necessary debug information when needed. I wanted to convert them trace functions which will allow us to more debug level prints such as netdev name, vlan etc. I think I remember comment from Leon too while working on it, but it was long haul that time. Its base infrastructure is just got ready with Chuck Lever's patch. So around 5.5, I should convert to trace calls. > diff --git a/drivers/infiniband/core/cache.c b/drivers/infiniband/core/cache.c > index 00fb3ea..2990075 100644 > --- a/drivers/infiniband/core/cache.c > +++ b/drivers/infiniband/core/cache.c > @@ -579,8 +579,8 @@ static int __ib_cache_gid_add(struct ib_device > *ib_dev, u8 port, > out_unlock: > mutex_unlock(&table->lock); > if (ret) > - pr_warn("%s: unable to add gid %pI6 error=%d\n", > - __func__, gid->raw, ret); > + pr_warn_ratelimited("%s: unable to add gid %pI6 > error=%d\n", > + __func__, gid->raw, ret); > return ret; > } > > -- > 2.7.4
On 2019/11/5 12:09, Parav Pandit wrote: > Hi Yixian Liu, > >> -----Original Message----- >> From: linux-rdma-owner@vger.kernel.org <linux-rdma- >> owner@vger.kernel.org> On Behalf Of Yixian Liu >> Sent: Monday, November 4, 2019 9:48 PM >> To: dledford@redhat.com; jgg@ziepe.ca >> Cc: Parav Pandit <parav@mellanox.com>; leon@kernel.org; linux- >> rdma@vger.kernel.org; linuxarm@huawei.com >> Subject: [PATCH rdma-next] RDMA/core: Use pr_warn_ratelimited >> >> This warning can be triggered easily when adding a large number of VLANs >> while the capacity of GID table is small. >> >> Fixes: 598ff6bae689 ("IB/core: Refactor GID modify code for RoCE") >> Signed-off-by: Yixian Liu <liuyixian@huawei.com> >> --- >> drivers/infiniband/core/cache.c | 4 ++-- >> 1 file changed, 2 insertions(+), 2 deletions(-) >> > Thanks for the patch. However, vlan netdevice addition is an administrative operation allowed to process which has CAP_NET_ADMIN privilege. > So large number of VLAN addition by a user for a RoCE device whose GID capacity is small is constrained operation. How can we constrain this operation from an administrator? > Therefore, we shouldn't be rate limiting it. > By rate limiting we miss the information about any bugs in GID cache management. pr_warn_ratelimited only prevent from printing **the same messages** frequently, why information will be lost? > At best we can convert it to dev_dbg() so that we still get the necessary debug information when needed. > I wanted to convert them trace functions which will allow us to more debug level prints such as netdev name, vlan etc. > I think I remember comment from Leon too while working on it, but it was long haul that time. > > Its base infrastructure is just got ready with Chuck Lever's patch. > So around 5.5, I should convert to trace calls. Okay, whatever, the situation described in commit message may be occurred, please consider it. > >> diff --git a/drivers/infiniband/core/cache.c b/drivers/infiniband/core/cache.c >> index 00fb3ea..2990075 100644 >> --- a/drivers/infiniband/core/cache.c >> +++ b/drivers/infiniband/core/cache.c >> @@ -579,8 +579,8 @@ static int __ib_cache_gid_add(struct ib_device >> *ib_dev, u8 port, >> out_unlock: >> mutex_unlock(&table->lock); >> if (ret) >> - pr_warn("%s: unable to add gid %pI6 error=%d\n", >> - __func__, gid->raw, ret); >> + pr_warn_ratelimited("%s: unable to add gid %pI6 >> error=%d\n", >> + __func__, gid->raw, ret); >> return ret; >> } >> >> -- >> 2.7.4 > > >
On Tue, Nov 05, 2019 at 04:09:12AM +0000, Parav Pandit wrote: > Hi Yixian Liu, > > > -----Original Message----- > > From: linux-rdma-owner@vger.kernel.org <linux-rdma- > > owner@vger.kernel.org> On Behalf Of Yixian Liu > > Sent: Monday, November 4, 2019 9:48 PM > > To: dledford@redhat.com; jgg@ziepe.ca > > Cc: Parav Pandit <parav@mellanox.com>; leon@kernel.org; linux- > > rdma@vger.kernel.org; linuxarm@huawei.com > > Subject: [PATCH rdma-next] RDMA/core: Use pr_warn_ratelimited > > > > This warning can be triggered easily when adding a large number of VLANs > > while the capacity of GID table is small. > > > > Fixes: 598ff6bae689 ("IB/core: Refactor GID modify code for RoCE") > > Signed-off-by: Yixian Liu <liuyixian@huawei.com> > > --- > > drivers/infiniband/core/cache.c | 4 ++-- > > 1 file changed, 2 insertions(+), 2 deletions(-) > > > Thanks for the patch. However, vlan netdevice addition is an administrative operation allowed to process which has CAP_NET_ADMIN privilege. > So large number of VLAN addition by a user for a RoCE device whose GID capacity is small is constrained operation. > Therefore, we shouldn't be rate limiting it. > By rate limiting we miss the information about any bugs in GID cache management. > At best we can convert it to dev_dbg() so that we still get the necessary debug information when needed. > I wanted to convert them trace functions which will allow us to more debug level prints such as netdev name, vlan etc. > I think I remember comment from Leon too while working on it, but it was long haul that time. I wanted to work on it, but it never happened. Thanks
> -----Original Message----- > From: Liuyixian (Eason) <liuyixian@huawei.com> > Sent: Tuesday, November 5, 2019 1:55 AM > To: Parav Pandit <parav@mellanox.com>; dledford@redhat.com; > jgg@ziepe.ca > Cc: leon@kernel.org; linux-rdma@vger.kernel.org; linuxarm@huawei.com > Subject: Re: [PATCH rdma-next] RDMA/core: Use pr_warn_ratelimited > > > > On 2019/11/5 12:09, Parav Pandit wrote: > > Hi Yixian Liu, > > > >> -----Original Message----- > >> From: linux-rdma-owner@vger.kernel.org <linux-rdma- > >> owner@vger.kernel.org> On Behalf Of Yixian Liu > >> Sent: Monday, November 4, 2019 9:48 PM > >> To: dledford@redhat.com; jgg@ziepe.ca > >> Cc: Parav Pandit <parav@mellanox.com>; leon@kernel.org; linux- > >> rdma@vger.kernel.org; linuxarm@huawei.com > >> Subject: [PATCH rdma-next] RDMA/core: Use pr_warn_ratelimited > >> > >> This warning can be triggered easily when adding a large number of > >> VLANs while the capacity of GID table is small. > >> > >> Fixes: 598ff6bae689 ("IB/core: Refactor GID modify code for RoCE") > >> Signed-off-by: Yixian Liu <liuyixian@huawei.com> > >> --- > >> drivers/infiniband/core/cache.c | 4 ++-- > >> 1 file changed, 2 insertions(+), 2 deletions(-) > >> > > Thanks for the patch. However, vlan netdevice addition is an > administrative operation allowed to process which has CAP_NET_ADMIN > privilege. > > So large number of VLAN addition by a user for a RoCE device whose GID > capacity is small is constrained operation. > > How can we constrain this operation from an administrator? > Administrator knows the GID table size of the RoCE device he is managing. > > Therefore, we shouldn't be rate limiting it. > > By rate limiting we miss the information about any bugs in GID cache > management. > > pr_warn_ratelimited only prevent from printing **the same messages** > frequently, why information will be lost? > Same message that may have different GID value and return code. So information is lost on why GID entry was not able to add (error by vendor driver, no space in table, something else etc). > > At best we can convert it to dev_dbg() so that we still get the necessary > debug information when needed. > > I wanted to convert them trace functions which will allow us to more > debug level prints such as netdev name, vlan etc. > > I think I remember comment from Leon too while working on it, but it was > long haul that time. > > > > Its base infrastructure is just got ready with Chuck Lever's patch. > > So around 5.5, I should convert to trace calls. > > Okay, whatever, the situation described in commit message may be > occurred, please consider it. > Yes. sure. Added to my todo list. > > > >> diff --git a/drivers/infiniband/core/cache.c > >> b/drivers/infiniband/core/cache.c index 00fb3ea..2990075 100644 > >> --- a/drivers/infiniband/core/cache.c > >> +++ b/drivers/infiniband/core/cache.c > >> @@ -579,8 +579,8 @@ static int __ib_cache_gid_add(struct ib_device > >> *ib_dev, u8 port, > >> out_unlock: > >> mutex_unlock(&table->lock); > >> if (ret) > >> - pr_warn("%s: unable to add gid %pI6 error=%d\n", > >> - __func__, gid->raw, ret); > >> + pr_warn_ratelimited("%s: unable to add gid %pI6 > >> error=%d\n", > >> + __func__, gid->raw, ret); > >> return ret; > >> } > >> > >> -- > >> 2.7.4 > > > > > >
On 2019/11/5 22:41, Leon Romanovsky wrote: > On Tue, Nov 05, 2019 at 04:09:12AM +0000, Parav Pandit wrote: >> Hi Yixian Liu, >> >>> -----Original Message----- >>> From: linux-rdma-owner@vger.kernel.org <linux-rdma- >>> owner@vger.kernel.org> On Behalf Of Yixian Liu >>> Sent: Monday, November 4, 2019 9:48 PM >>> To: dledford@redhat.com; jgg@ziepe.ca >>> Cc: Parav Pandit <parav@mellanox.com>; leon@kernel.org; linux- >>> rdma@vger.kernel.org; linuxarm@huawei.com >>> Subject: [PATCH rdma-next] RDMA/core: Use pr_warn_ratelimited >>> >>> This warning can be triggered easily when adding a large number of VLANs >>> while the capacity of GID table is small. >>> >>> Fixes: 598ff6bae689 ("IB/core: Refactor GID modify code for RoCE") >>> Signed-off-by: Yixian Liu <liuyixian@huawei.com> >>> --- >>> drivers/infiniband/core/cache.c | 4 ++-- >>> 1 file changed, 2 insertions(+), 2 deletions(-) >>> >> Thanks for the patch. However, vlan netdevice addition is an administrative operation allowed to process which has CAP_NET_ADMIN privilege. >> So large number of VLAN addition by a user for a RoCE device whose GID capacity is small is constrained operation. >> Therefore, we shouldn't be rate limiting it. >> By rate limiting we miss the information about any bugs in GID cache management. >> At best we can convert it to dev_dbg() so that we still get the necessary debug information when needed. >> I wanted to convert them trace functions which will allow us to more debug level prints such as netdev name, vlan etc. >> I think I remember comment from Leon too while working on it, but it was long haul that time. > > I wanted to work on it, but it never happened. I see, thanks. > > Thanks > > . >
On 2019/11/5 23:02, Parav Pandit wrote: > > >> -----Original Message----- >> From: Liuyixian (Eason) <liuyixian@huawei.com> >> Sent: Tuesday, November 5, 2019 1:55 AM >> To: Parav Pandit <parav@mellanox.com>; dledford@redhat.com; >> jgg@ziepe.ca >> Cc: leon@kernel.org; linux-rdma@vger.kernel.org; linuxarm@huawei.com >> Subject: Re: [PATCH rdma-next] RDMA/core: Use pr_warn_ratelimited >> >> >> >> On 2019/11/5 12:09, Parav Pandit wrote: >>> Hi Yixian Liu, >>> >>>> -----Original Message----- >>>> From: linux-rdma-owner@vger.kernel.org <linux-rdma- >>>> owner@vger.kernel.org> On Behalf Of Yixian Liu >>>> Sent: Monday, November 4, 2019 9:48 PM >>>> To: dledford@redhat.com; jgg@ziepe.ca >>>> Cc: Parav Pandit <parav@mellanox.com>; leon@kernel.org; linux- >>>> rdma@vger.kernel.org; linuxarm@huawei.com >>>> Subject: [PATCH rdma-next] RDMA/core: Use pr_warn_ratelimited >>>> >>>> This warning can be triggered easily when adding a large number of >>>> VLANs while the capacity of GID table is small. >>>> >>>> Fixes: 598ff6bae689 ("IB/core: Refactor GID modify code for RoCE") >>>> Signed-off-by: Yixian Liu <liuyixian@huawei.com> >>>> --- >>>> drivers/infiniband/core/cache.c | 4 ++-- >>>> 1 file changed, 2 insertions(+), 2 deletions(-) >>>> >>> Thanks for the patch. However, vlan netdevice addition is an >> administrative operation allowed to process which has CAP_NET_ADMIN >> privilege. >>> So large number of VLAN addition by a user for a RoCE device whose GID >> capacity is small is constrained operation. >> >> How can we constrain this operation from an administrator? >> > Administrator knows the GID table size of the RoCE device he is managing. > >>> Therefore, we shouldn't be rate limiting it. >>> By rate limiting we miss the information about any bugs in GID cache >> management. >> >> pr_warn_ratelimited only prevent from printing **the same messages** >> frequently, why information will be lost? >> > Same message that may have different GID value and return code. > So information is lost on why GID entry was not able to add (error by vendor driver, no space in table, something else etc). > > >>> At best we can convert it to dev_dbg() so that we still get the necessary >> debug information when needed. >>> I wanted to convert them trace functions which will allow us to more >> debug level prints such as netdev name, vlan etc. >>> I think I remember comment from Leon too while working on it, but it was >> long haul that time. >>> >>> Its base infrastructure is just got ready with Chuck Lever's patch. >>> So around 5.5, I should convert to trace calls. >> >> Okay, whatever, the situation described in commit message may be >> occurred, please consider it. >> > Yes. sure. Added to my todo list. Thanks. > >>> >>>> diff --git a/drivers/infiniband/core/cache.c >>>> b/drivers/infiniband/core/cache.c index 00fb3ea..2990075 100644 >>>> --- a/drivers/infiniband/core/cache.c >>>> +++ b/drivers/infiniband/core/cache.c >>>> @@ -579,8 +579,8 @@ static int __ib_cache_gid_add(struct ib_device >>>> *ib_dev, u8 port, >>>> out_unlock: >>>> mutex_unlock(&table->lock); >>>> if (ret) >>>> - pr_warn("%s: unable to add gid %pI6 error=%d\n", >>>> - __func__, gid->raw, ret); >>>> + pr_warn_ratelimited("%s: unable to add gid %pI6 >>>> error=%d\n", >>>> + __func__, gid->raw, ret); >>>> return ret; >>>> } >>>> >>>> -- >>>> 2.7.4 >>> >>> >>> >
diff --git a/drivers/infiniband/core/cache.c b/drivers/infiniband/core/cache.c index 00fb3ea..2990075 100644 --- a/drivers/infiniband/core/cache.c +++ b/drivers/infiniband/core/cache.c @@ -579,8 +579,8 @@ static int __ib_cache_gid_add(struct ib_device *ib_dev, u8 port, out_unlock: mutex_unlock(&table->lock); if (ret) - pr_warn("%s: unable to add gid %pI6 error=%d\n", - __func__, gid->raw, ret); + pr_warn_ratelimited("%s: unable to add gid %pI6 error=%d\n", + __func__, gid->raw, ret); return ret; }
This warning can be triggered easily when adding a large number of VLANs while the capacity of GID table is small. Fixes: 598ff6bae689 ("IB/core: Refactor GID modify code for RoCE") Signed-off-by: Yixian Liu <liuyixian@huawei.com> --- drivers/infiniband/core/cache.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)