diff mbox

[1/1] IB/rxe: drop QP0 silently

Message ID 1529551707-25612-1-git-send-email-yanjun.zhu@oracle.com (mailing list archive)
State Changes Requested
Headers show

Commit Message

Zhu Yanjun June 21, 2018, 3:28 a.m. UTC
According to "Annex A16: RDMA over Converged Ethernet (RoCE)",

A16.4.3 MANAGEMENT INTERFACES

As defined in the base specification, a special Queue Pair, QP0 is defined
solely for communication between subnet manager(s) and subnet management
agents. Since such an IB-defined subnet management architecture
is outside the scope of this annex, it follows that there is also no
requirement that a port which conforms to this annex be associated with
a QP0. Thus, for end nodes designed to conform to this annex, the concept
of QP0 is undefined and unused for any port connected to an
Ethernet network.
CA16-8: A packet arriving at a RoCE port containing a BTH with the
destination QP field set to QP0 shall be silently dropped.

CC: Srinivas Eeda <srinivas.eeda@oracle.com>
CC: Junxiao Bi <junxiao.bi@oracle.com>
Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com>
---
 drivers/infiniband/sw/rxe/rxe_recv.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

Comments

Moni Shoua June 24, 2018, 3:22 p.m. UTC | #1
On Thu, Jun 21, 2018 at 6:30 AM Zhu Yanjun <yanjun.zhu@oracle.com> wrote:
>
> According to "Annex A16: RDMA over Converged Ethernet (RoCE)",
>
> A16.4.3 MANAGEMENT INTERFACES
>
> As defined in the base specification, a special Queue Pair, QP0 is defined
> solely for communication between subnet manager(s) and subnet management
> agents. Since such an IB-defined subnet management architecture
> is outside the scope of this annex, it follows that there is also no
> requirement that a port which conforms to this annex be associated with
> a QP0. Thus, for end nodes designed to conform to this annex, the concept
> of QP0 is undefined and unused for any port connected to an
> Ethernet network.
> CA16-8: A packet arriving at a RoCE port containing a BTH with the
> destination QP field set to QP0 shall be silently dropped.
>
> CC: Srinivas Eeda <srinivas.eeda@oracle.com>
> CC: Junxiao Bi <junxiao.bi@oracle.com>
> Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com>
> ---
>  drivers/infiniband/sw/rxe/rxe_recv.c | 9 +++++++--
>  1 file changed, 7 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/infiniband/sw/rxe/rxe_recv.c b/drivers/infiniband/sw/rxe/rxe_recv.c
> index b0c8d1e..08cb97a 100644
> --- a/drivers/infiniband/sw/rxe/rxe_recv.c
> +++ b/drivers/infiniband/sw/rxe/rxe_recv.c
> @@ -225,9 +225,14 @@ static int hdr_check(struct rxe_pkt_info *pkt)
>                 goto err1;
>         }
>
> +       if (unlikely(qpn == 0)) {
> +               pr_warn_ratelimited("QP 0 is not supported\n");
> +               goto err1;
> +       }
> +
>         if (qpn != IB_MULTICAST_QPN) {
> -               index = (qpn == 0) ? port->qp_smi_index :
> -                       ((qpn == 1) ? port->qp_gsi_index : qpn);
> +               index = (qpn == 1) ? port->qp_gsi_index : qpn;
> +
>                 qp = rxe_pool_get_index(&rxe->qp_pool, index);
>                 if (unlikely(!qp)) {
>                         pr_warn_ratelimited("no qp matches qpn 0x%x\n", qpn);
> --
> 2.7.4
>
> --
Acked-by: Moni Shoua <monis@mellanox.com>
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Leon Romanovsky June 24, 2018, 3:55 p.m. UTC | #2
On Wed, Jun 20, 2018 at 11:28:27PM -0400, Zhu Yanjun wrote:
> According to "Annex A16: RDMA over Converged Ethernet (RoCE)",
>
> A16.4.3 MANAGEMENT INTERFACES
>
> As defined in the base specification, a special Queue Pair, QP0 is defined
> solely for communication between subnet manager(s) and subnet management
> agents. Since such an IB-defined subnet management architecture
> is outside the scope of this annex, it follows that there is also no
> requirement that a port which conforms to this annex be associated with
> a QP0. Thus, for end nodes designed to conform to this annex, the concept
> of QP0 is undefined and unused for any port connected to an
> Ethernet network.
> CA16-8: A packet arriving at a RoCE port containing a BTH with the
> destination QP field set to QP0 shall be silently dropped.
>
> CC: Srinivas Eeda <srinivas.eeda@oracle.com>
> CC: Junxiao Bi <junxiao.bi@oracle.com>
> Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com>
> ---
>  drivers/infiniband/sw/rxe/rxe_recv.c | 9 +++++++--
>  1 file changed, 7 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/infiniband/sw/rxe/rxe_recv.c b/drivers/infiniband/sw/rxe/rxe_recv.c
> index b0c8d1e..08cb97a 100644
> --- a/drivers/infiniband/sw/rxe/rxe_recv.c
> +++ b/drivers/infiniband/sw/rxe/rxe_recv.c
> @@ -225,9 +225,14 @@ static int hdr_check(struct rxe_pkt_info *pkt)
>  		goto err1;
>  	}
>
> +	if (unlikely(qpn == 0)) {
> +		pr_warn_ratelimited("QP 0 is not supported\n");

Safest option is not to print this warning at all.

> +		goto err1;
> +	}
> +
>  	if (qpn != IB_MULTICAST_QPN) {
> -		index = (qpn == 0) ? port->qp_smi_index :
> -			((qpn == 1) ? port->qp_gsi_index : qpn);
> +		index = (qpn == 1) ? port->qp_gsi_index : qpn;
> +
>  		qp = rxe_pool_get_index(&rxe->qp_pool, index);
>  		if (unlikely(!qp)) {
>  			pr_warn_ratelimited("no qp matches qpn 0x%x\n", qpn);
> --
> 2.7.4
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
Zhu Yanjun June 27, 2018, 3:22 a.m. UTC | #3
On 2018/6/24 23:55, Leon Romanovsky wrote:
> On Wed, Jun 20, 2018 at 11:28:27PM -0400, Zhu Yanjun wrote:
>> According to "Annex A16: RDMA over Converged Ethernet (RoCE)",
>>
>> A16.4.3 MANAGEMENT INTERFACES
>>
>> As defined in the base specification, a special Queue Pair, QP0 is defined
>> solely for communication between subnet manager(s) and subnet management
>> agents. Since such an IB-defined subnet management architecture
>> is outside the scope of this annex, it follows that there is also no
>> requirement that a port which conforms to this annex be associated with
>> a QP0. Thus, for end nodes designed to conform to this annex, the concept
>> of QP0 is undefined and unused for any port connected to an
>> Ethernet network.
>> CA16-8: A packet arriving at a RoCE port containing a BTH with the
>> destination QP field set to QP0 shall be silently dropped.
>>
>> CC: Srinivas Eeda <srinivas.eeda@oracle.com>
>> CC: Junxiao Bi <junxiao.bi@oracle.com>
>> Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com>
>> ---
>>   drivers/infiniband/sw/rxe/rxe_recv.c | 9 +++++++--
>>   1 file changed, 7 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/infiniband/sw/rxe/rxe_recv.c b/drivers/infiniband/sw/rxe/rxe_recv.c
>> index b0c8d1e..08cb97a 100644
>> --- a/drivers/infiniband/sw/rxe/rxe_recv.c
>> +++ b/drivers/infiniband/sw/rxe/rxe_recv.c
>> @@ -225,9 +225,14 @@ static int hdr_check(struct rxe_pkt_info *pkt)
>>   		goto err1;
>>   	}
>>
>> +	if (unlikely(qpn == 0)) {
>> +		pr_warn_ratelimited("QP 0 is not supported\n");
> Safest option is not to print this warning at all.

Sorry to reply late. It is to notify the users that QP 0 is dropped. So 
this warning is necessary. Or else the user
will take time to investigate it.
In a word, this warning can save time.

Zhu Yanjun
>
>> +		goto err1;
>> +	}
>> +
>>   	if (qpn != IB_MULTICAST_QPN) {
>> -		index = (qpn == 0) ? port->qp_smi_index :
>> -			((qpn == 1) ? port->qp_gsi_index : qpn);
>> +		index = (qpn == 1) ? port->qp_gsi_index : qpn;
>> +
>>   		qp = rxe_pool_get_index(&rxe->qp_pool, index);
>>   		if (unlikely(!qp)) {
>>   			pr_warn_ratelimited("no qp matches qpn 0x%x\n", qpn);
>> --
>> 2.7.4
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Leon Romanovsky June 27, 2018, 5:56 a.m. UTC | #4
On Wed, Jun 27, 2018 at 11:22:18AM +0800, Yanjun Zhu wrote:
>
>
> On 2018/6/24 23:55, Leon Romanovsky wrote:
> > On Wed, Jun 20, 2018 at 11:28:27PM -0400, Zhu Yanjun wrote:
> > > According to "Annex A16: RDMA over Converged Ethernet (RoCE)",
> > >
> > > A16.4.3 MANAGEMENT INTERFACES
> > >
> > > As defined in the base specification, a special Queue Pair, QP0 is defined
> > > solely for communication between subnet manager(s) and subnet management
> > > agents. Since such an IB-defined subnet management architecture
> > > is outside the scope of this annex, it follows that there is also no
> > > requirement that a port which conforms to this annex be associated with
> > > a QP0. Thus, for end nodes designed to conform to this annex, the concept
> > > of QP0 is undefined and unused for any port connected to an
> > > Ethernet network.
> > > CA16-8: A packet arriving at a RoCE port containing a BTH with the
> > > destination QP field set to QP0 shall be silently dropped.
> > >
> > > CC: Srinivas Eeda <srinivas.eeda@oracle.com>
> > > CC: Junxiao Bi <junxiao.bi@oracle.com>
> > > Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com>
> > > ---
> > >   drivers/infiniband/sw/rxe/rxe_recv.c | 9 +++++++--
> > >   1 file changed, 7 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/drivers/infiniband/sw/rxe/rxe_recv.c b/drivers/infiniband/sw/rxe/rxe_recv.c
> > > index b0c8d1e..08cb97a 100644
> > > --- a/drivers/infiniband/sw/rxe/rxe_recv.c
> > > +++ b/drivers/infiniband/sw/rxe/rxe_recv.c
> > > @@ -225,9 +225,14 @@ static int hdr_check(struct rxe_pkt_info *pkt)
> > >   		goto err1;
> > >   	}
> > >
> > > +	if (unlikely(qpn == 0)) {
> > > +		pr_warn_ratelimited("QP 0 is not supported\n");
> > Safest option is not to print this warning at all.
>
> Sorry to reply late. It is to notify the users that QP 0 is dropped. So this
> warning is necessary. Or else the user
> will take time to investigate it.
> In a word, this warning can save time.

I'm saying it from my experience of running default fedora 28 with
default opensm on RoCE system where I over flooded with messages
"ib_register_mad_agent: QP 0 not supported" from mad.c.

I wouldn't want to see same thing for RXE too.

Thanks

>
> Zhu Yanjun
> >
> > > +		goto err1;
> > > +	}
> > > +
> > >   	if (qpn != IB_MULTICAST_QPN) {
> > > -		index = (qpn == 0) ? port->qp_smi_index :
> > > -			((qpn == 1) ? port->qp_gsi_index : qpn);
> > > +		index = (qpn == 1) ? port->qp_gsi_index : qpn;
> > > +
> > >   		qp = rxe_pool_get_index(&rxe->qp_pool, index);
> > >   		if (unlikely(!qp)) {
> > >   			pr_warn_ratelimited("no qp matches qpn 0x%x\n", qpn);
> > > --
> > > 2.7.4
> > >
> > > --
> > > To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
> > > the body of a message to majordomo@vger.kernel.org
> > > More majordomo info at  http://vger.kernel.org/majordomo-info.html
>
Jason Gunthorpe June 27, 2018, 2:53 p.m. UTC | #5
On Wed, Jun 27, 2018 at 11:22:18AM +0800, Yanjun Zhu wrote:
> 
> 
> On 2018/6/24 23:55, Leon Romanovsky wrote:
> >On Wed, Jun 20, 2018 at 11:28:27PM -0400, Zhu Yanjun wrote:
> >>According to "Annex A16: RDMA over Converged Ethernet (RoCE)",
> >>
> >>A16.4.3 MANAGEMENT INTERFACES
> >>
> >>As defined in the base specification, a special Queue Pair, QP0 is defined
> >>solely for communication between subnet manager(s) and subnet management
> >>agents. Since such an IB-defined subnet management architecture
> >>is outside the scope of this annex, it follows that there is also no
> >>requirement that a port which conforms to this annex be associated with
> >>a QP0. Thus, for end nodes designed to conform to this annex, the concept
> >>of QP0 is undefined and unused for any port connected to an
> >>Ethernet network.
> >>CA16-8: A packet arriving at a RoCE port containing a BTH with the
> >>destination QP field set to QP0 shall be silently dropped.
> >>
> >>CC: Srinivas Eeda <srinivas.eeda@oracle.com>
> >>CC: Junxiao Bi <junxiao.bi@oracle.com>
> >>Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com>
> >>  drivers/infiniband/sw/rxe/rxe_recv.c | 9 +++++++--
> >>  1 file changed, 7 insertions(+), 2 deletions(-)
> >>
> >>diff --git a/drivers/infiniband/sw/rxe/rxe_recv.c b/drivers/infiniband/sw/rxe/rxe_recv.c
> >>index b0c8d1e..08cb97a 100644
> >>+++ b/drivers/infiniband/sw/rxe/rxe_recv.c
> >>@@ -225,9 +225,14 @@ static int hdr_check(struct rxe_pkt_info *pkt)
> >>  		goto err1;
> >>  	}
> >>
> >>+	if (unlikely(qpn == 0)) {
> >>+		pr_warn_ratelimited("QP 0 is not supported\n");
> >Safest option is not to print this warning at all.
> 
> Sorry to reply late. It is to notify the users that QP 0 is dropped. So this
> warning is necessary. Or else the user
> will take time to investigate it.
> In a word, this warning can save time.

We rarely print based on received network traffic. You should drop it.

Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Zhu Yanjun June 28, 2018, 1:37 a.m. UTC | #6
On 2018/6/27 13:56, Leon Romanovsky wrote:
> On Wed, Jun 27, 2018 at 11:22:18AM +0800, Yanjun Zhu wrote:
>>
>> On 2018/6/24 23:55, Leon Romanovsky wrote:
>>> On Wed, Jun 20, 2018 at 11:28:27PM -0400, Zhu Yanjun wrote:
>>>> According to "Annex A16: RDMA over Converged Ethernet (RoCE)",
>>>>
>>>> A16.4.3 MANAGEMENT INTERFACES
>>>>
>>>> As defined in the base specification, a special Queue Pair, QP0 is defined
>>>> solely for communication between subnet manager(s) and subnet management
>>>> agents. Since such an IB-defined subnet management architecture
>>>> is outside the scope of this annex, it follows that there is also no
>>>> requirement that a port which conforms to this annex be associated with
>>>> a QP0. Thus, for end nodes designed to conform to this annex, the concept
>>>> of QP0 is undefined and unused for any port connected to an
>>>> Ethernet network.
>>>> CA16-8: A packet arriving at a RoCE port containing a BTH with the
>>>> destination QP field set to QP0 shall be silently dropped.
>>>>
>>>> CC: Srinivas Eeda <srinivas.eeda@oracle.com>
>>>> CC: Junxiao Bi <junxiao.bi@oracle.com>
>>>> Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com>
>>>> ---
>>>>    drivers/infiniband/sw/rxe/rxe_recv.c | 9 +++++++--
>>>>    1 file changed, 7 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/drivers/infiniband/sw/rxe/rxe_recv.c b/drivers/infiniband/sw/rxe/rxe_recv.c
>>>> index b0c8d1e..08cb97a 100644
>>>> --- a/drivers/infiniband/sw/rxe/rxe_recv.c
>>>> +++ b/drivers/infiniband/sw/rxe/rxe_recv.c
>>>> @@ -225,9 +225,14 @@ static int hdr_check(struct rxe_pkt_info *pkt)
>>>>    		goto err1;
>>>>    	}
>>>>
>>>> +	if (unlikely(qpn == 0)) {
>>>> +		pr_warn_ratelimited("QP 0 is not supported\n");
>>> Safest option is not to print this warning at all.
>> Sorry to reply late. It is to notify the users that QP 0 is dropped. So this
>> warning is necessary. Or else the user
>> will take time to investigate it.
>> In a word, this warning can save time.
> I'm saying it from my experience of running default fedora 28 with
> default opensm on RoCE system where I over flooded with messages
> "ib_register_mad_agent: QP 0 not supported" from mad.c.
>
> I wouldn't want to see same thing for RXE too.

Thanks.
How about WARN_ONCE(1, "QP 0 is not supported\n");?
This only prints once.
This can avoid investigating and this will not print too much messages.

Zhu Yanjun

>
> Thanks
>
>> Zhu Yanjun
>>>> +		goto err1;
>>>> +	}
>>>> +
>>>>    	if (qpn != IB_MULTICAST_QPN) {
>>>> -		index = (qpn == 0) ? port->qp_smi_index :
>>>> -			((qpn == 1) ? port->qp_gsi_index : qpn);
>>>> +		index = (qpn == 1) ? port->qp_gsi_index : qpn;
>>>> +
>>>>    		qp = rxe_pool_get_index(&rxe->qp_pool, index);
>>>>    		if (unlikely(!qp)) {
>>>>    			pr_warn_ratelimited("no qp matches qpn 0x%x\n", qpn);
>>>> --
>>>> 2.7.4
>>>>
>>>> --
>>>> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
>>>> the body of a message to majordomo@vger.kernel.org
>>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jason Gunthorpe June 28, 2018, 1:51 a.m. UTC | #7
On Thu, Jun 28, 2018 at 09:37:39AM +0800, Yanjun Zhu wrote:
> 
> 
> On 2018/6/27 13:56, Leon Romanovsky wrote:
> >On Wed, Jun 27, 2018 at 11:22:18AM +0800, Yanjun Zhu wrote:
> >>
> >>On 2018/6/24 23:55, Leon Romanovsky wrote:
> >>>On Wed, Jun 20, 2018 at 11:28:27PM -0400, Zhu Yanjun wrote:
> >>>>According to "Annex A16: RDMA over Converged Ethernet (RoCE)",
> >>>>
> >>>>A16.4.3 MANAGEMENT INTERFACES
> >>>>
> >>>>As defined in the base specification, a special Queue Pair, QP0 is defined
> >>>>solely for communication between subnet manager(s) and subnet management
> >>>>agents. Since such an IB-defined subnet management architecture
> >>>>is outside the scope of this annex, it follows that there is also no
> >>>>requirement that a port which conforms to this annex be associated with
> >>>>a QP0. Thus, for end nodes designed to conform to this annex, the concept
> >>>>of QP0 is undefined and unused for any port connected to an
> >>>>Ethernet network.
> >>>>CA16-8: A packet arriving at a RoCE port containing a BTH with the
> >>>>destination QP field set to QP0 shall be silently dropped.
> >>>>
> >>>>CC: Srinivas Eeda <srinivas.eeda@oracle.com>
> >>>>CC: Junxiao Bi <junxiao.bi@oracle.com>
> >>>>Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com>
> >>>>   drivers/infiniband/sw/rxe/rxe_recv.c | 9 +++++++--
> >>>>   1 file changed, 7 insertions(+), 2 deletions(-)
> >>>>
> >>>>diff --git a/drivers/infiniband/sw/rxe/rxe_recv.c b/drivers/infiniband/sw/rxe/rxe_recv.c
> >>>>index b0c8d1e..08cb97a 100644
> >>>>+++ b/drivers/infiniband/sw/rxe/rxe_recv.c
> >>>>@@ -225,9 +225,14 @@ static int hdr_check(struct rxe_pkt_info *pkt)
> >>>>   		goto err1;
> >>>>   	}
> >>>>
> >>>>+	if (unlikely(qpn == 0)) {
> >>>>+		pr_warn_ratelimited("QP 0 is not supported\n");
> >>>Safest option is not to print this warning at all.
> >>Sorry to reply late. It is to notify the users that QP 0 is dropped. So this
> >>warning is necessary. Or else the user
> >>will take time to investigate it.
> >>In a word, this warning can save time.
> >I'm saying it from my experience of running default fedora 28 with
> >default opensm on RoCE system where I over flooded with messages
> >"ib_register_mad_agent: QP 0 not supported" from mad.c.
> >
> >I wouldn't want to see same thing for RXE too.
> 
> Thanks.
> How about WARN_ONCE(1, "QP 0 is not supported\n");?
> This only prints once.
> This can avoid investigating and this will not print too much messages.

Why are you worried about this? No roce end point should even be able
to send qp0 traffic. There is nothing to investigate.

Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Zhu Yanjun June 28, 2018, 4:22 a.m. UTC | #8
On 2018/6/28 9:51, Jason Gunthorpe wrote:
> On Thu, Jun 28, 2018 at 09:37:39AM +0800, Yanjun Zhu wrote:
>>
>> On 2018/6/27 13:56, Leon Romanovsky wrote:
>>> On Wed, Jun 27, 2018 at 11:22:18AM +0800, Yanjun Zhu wrote:
>>>> On 2018/6/24 23:55, Leon Romanovsky wrote:
>>>>> On Wed, Jun 20, 2018 at 11:28:27PM -0400, Zhu Yanjun wrote:
>>>>>> According to "Annex A16: RDMA over Converged Ethernet (RoCE)",
>>>>>>
>>>>>> A16.4.3 MANAGEMENT INTERFACES
>>>>>>
>>>>>> As defined in the base specification, a special Queue Pair, QP0 is defined
>>>>>> solely for communication between subnet manager(s) and subnet management
>>>>>> agents. Since such an IB-defined subnet management architecture
>>>>>> is outside the scope of this annex, it follows that there is also no
>>>>>> requirement that a port which conforms to this annex be associated with
>>>>>> a QP0. Thus, for end nodes designed to conform to this annex, the concept
>>>>>> of QP0 is undefined and unused for any port connected to an
>>>>>> Ethernet network.
>>>>>> CA16-8: A packet arriving at a RoCE port containing a BTH with the
>>>>>> destination QP field set to QP0 shall be silently dropped.
>>>>>>
>>>>>> CC: Srinivas Eeda <srinivas.eeda@oracle.com>
>>>>>> CC: Junxiao Bi <junxiao.bi@oracle.com>
>>>>>> Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com>
>>>>>>    drivers/infiniband/sw/rxe/rxe_recv.c | 9 +++++++--
>>>>>>    1 file changed, 7 insertions(+), 2 deletions(-)
>>>>>>
>>>>>> diff --git a/drivers/infiniband/sw/rxe/rxe_recv.c b/drivers/infiniband/sw/rxe/rxe_recv.c
>>>>>> index b0c8d1e..08cb97a 100644
>>>>>> +++ b/drivers/infiniband/sw/rxe/rxe_recv.c
>>>>>> @@ -225,9 +225,14 @@ static int hdr_check(struct rxe_pkt_info *pkt)
>>>>>>    		goto err1;
>>>>>>    	}
>>>>>>
>>>>>> +	if (unlikely(qpn == 0)) {
>>>>>> +		pr_warn_ratelimited("QP 0 is not supported\n");
>>>>> Safest option is not to print this warning at all.
>>>> Sorry to reply late. It is to notify the users that QP 0 is dropped. So this
>>>> warning is necessary. Or else the user
>>>> will take time to investigate it.
>>>> In a word, this warning can save time.
>>> I'm saying it from my experience of running default fedora 28 with
>>> default opensm on RoCE system where I over flooded with messages
>>> "ib_register_mad_agent: QP 0 not supported" from mad.c.
>>>
>>> I wouldn't want to see same thing for RXE too.
>> Thanks.
>> How about WARN_ONCE(1, "QP 0 is not supported\n");?
>> This only prints once.
>> This can avoid investigating and this will not print too much messages.
> Why are you worried about this? No roce end point should even be able
> to send qp0 traffic. There is nothing to investigate.

If in some case, we need to find out where QP 0 is dropped. With this 
warning, we can easily find
where QP 0 is dropped. Without this warning, maybe we will do some work.

Zhu Yanjun
>
> Jason
>

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Leon Romanovsky June 28, 2018, 4:27 a.m. UTC | #9
On Thu, Jun 28, 2018 at 12:22:41PM +0800, Yanjun Zhu wrote:
>
>
> On 2018/6/28 9:51, Jason Gunthorpe wrote:
> > On Thu, Jun 28, 2018 at 09:37:39AM +0800, Yanjun Zhu wrote:
> > >
> > > On 2018/6/27 13:56, Leon Romanovsky wrote:
> > > > On Wed, Jun 27, 2018 at 11:22:18AM +0800, Yanjun Zhu wrote:
> > > > > On 2018/6/24 23:55, Leon Romanovsky wrote:
> > > > > > On Wed, Jun 20, 2018 at 11:28:27PM -0400, Zhu Yanjun wrote:
> > > > > > > According to "Annex A16: RDMA over Converged Ethernet (RoCE)",
> > > > > > >
> > > > > > > A16.4.3 MANAGEMENT INTERFACES
> > > > > > >
> > > > > > > As defined in the base specification, a special Queue Pair, QP0 is defined
> > > > > > > solely for communication between subnet manager(s) and subnet management
> > > > > > > agents. Since such an IB-defined subnet management architecture
> > > > > > > is outside the scope of this annex, it follows that there is also no
> > > > > > > requirement that a port which conforms to this annex be associated with
> > > > > > > a QP0. Thus, for end nodes designed to conform to this annex, the concept
> > > > > > > of QP0 is undefined and unused for any port connected to an
> > > > > > > Ethernet network.
> > > > > > > CA16-8: A packet arriving at a RoCE port containing a BTH with the
> > > > > > > destination QP field set to QP0 shall be silently dropped.
> > > > > > >
> > > > > > > CC: Srinivas Eeda <srinivas.eeda@oracle.com>
> > > > > > > CC: Junxiao Bi <junxiao.bi@oracle.com>
> > > > > > > Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com>
> > > > > > >    drivers/infiniband/sw/rxe/rxe_recv.c | 9 +++++++--
> > > > > > >    1 file changed, 7 insertions(+), 2 deletions(-)
> > > > > > >
> > > > > > > diff --git a/drivers/infiniband/sw/rxe/rxe_recv.c b/drivers/infiniband/sw/rxe/rxe_recv.c
> > > > > > > index b0c8d1e..08cb97a 100644
> > > > > > > +++ b/drivers/infiniband/sw/rxe/rxe_recv.c
> > > > > > > @@ -225,9 +225,14 @@ static int hdr_check(struct rxe_pkt_info *pkt)
> > > > > > >    		goto err1;
> > > > > > >    	}
> > > > > > >
> > > > > > > +	if (unlikely(qpn == 0)) {
> > > > > > > +		pr_warn_ratelimited("QP 0 is not supported\n");
> > > > > > Safest option is not to print this warning at all.
> > > > > Sorry to reply late. It is to notify the users that QP 0 is dropped. So this
> > > > > warning is necessary. Or else the user
> > > > > will take time to investigate it.
> > > > > In a word, this warning can save time.
> > > > I'm saying it from my experience of running default fedora 28 with
> > > > default opensm on RoCE system where I over flooded with messages
> > > > "ib_register_mad_agent: QP 0 not supported" from mad.c.
> > > >
> > > > I wouldn't want to see same thing for RXE too.
> > > Thanks.
> > > How about WARN_ONCE(1, "QP 0 is not supported\n");?
> > > This only prints once.
> > > This can avoid investigating and this will not print too much messages.
> > Why are you worried about this? No roce end point should even be able
> > to send qp0 traffic. There is nothing to investigate.
>
> If in some case, we need to find out where QP 0 is dropped. With this
> warning, we can easily find
> where QP 0 is dropped. Without this warning, maybe we will do some work.

Let's do not over engineer it.
WARN_ONCE will produce nice stack trace which is completely not needed.

Thanks

>
> Zhu Yanjun
> >
> > Jason
> >
>
Zhu Yanjun June 28, 2018, 4:31 a.m. UTC | #10
On 2018/6/28 12:27, Leon Romanovsky wrote:
> On Thu, Jun 28, 2018 at 12:22:41PM +0800, Yanjun Zhu wrote:
>>
>> On 2018/6/28 9:51, Jason Gunthorpe wrote:
>>> On Thu, Jun 28, 2018 at 09:37:39AM +0800, Yanjun Zhu wrote:
>>>> On 2018/6/27 13:56, Leon Romanovsky wrote:
>>>>> On Wed, Jun 27, 2018 at 11:22:18AM +0800, Yanjun Zhu wrote:
>>>>>> On 2018/6/24 23:55, Leon Romanovsky wrote:
>>>>>>> On Wed, Jun 20, 2018 at 11:28:27PM -0400, Zhu Yanjun wrote:
>>>>>>>> According to "Annex A16: RDMA over Converged Ethernet (RoCE)",
>>>>>>>>
>>>>>>>> A16.4.3 MANAGEMENT INTERFACES
>>>>>>>>
>>>>>>>> As defined in the base specification, a special Queue Pair, QP0 is defined
>>>>>>>> solely for communication between subnet manager(s) and subnet management
>>>>>>>> agents. Since such an IB-defined subnet management architecture
>>>>>>>> is outside the scope of this annex, it follows that there is also no
>>>>>>>> requirement that a port which conforms to this annex be associated with
>>>>>>>> a QP0. Thus, for end nodes designed to conform to this annex, the concept
>>>>>>>> of QP0 is undefined and unused for any port connected to an
>>>>>>>> Ethernet network.
>>>>>>>> CA16-8: A packet arriving at a RoCE port containing a BTH with the
>>>>>>>> destination QP field set to QP0 shall be silently dropped.
>>>>>>>>
>>>>>>>> CC: Srinivas Eeda <srinivas.eeda@oracle.com>
>>>>>>>> CC: Junxiao Bi <junxiao.bi@oracle.com>
>>>>>>>> Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com>
>>>>>>>>     drivers/infiniband/sw/rxe/rxe_recv.c | 9 +++++++--
>>>>>>>>     1 file changed, 7 insertions(+), 2 deletions(-)
>>>>>>>>
>>>>>>>> diff --git a/drivers/infiniband/sw/rxe/rxe_recv.c b/drivers/infiniband/sw/rxe/rxe_recv.c
>>>>>>>> index b0c8d1e..08cb97a 100644
>>>>>>>> +++ b/drivers/infiniband/sw/rxe/rxe_recv.c
>>>>>>>> @@ -225,9 +225,14 @@ static int hdr_check(struct rxe_pkt_info *pkt)
>>>>>>>>     		goto err1;
>>>>>>>>     	}
>>>>>>>>
>>>>>>>> +	if (unlikely(qpn == 0)) {
>>>>>>>> +		pr_warn_ratelimited("QP 0 is not supported\n");
>>>>>>> Safest option is not to print this warning at all.
>>>>>> Sorry to reply late. It is to notify the users that QP 0 is dropped. So this
>>>>>> warning is necessary. Or else the user
>>>>>> will take time to investigate it.
>>>>>> In a word, this warning can save time.
>>>>> I'm saying it from my experience of running default fedora 28 with
>>>>> default opensm on RoCE system where I over flooded with messages
>>>>> "ib_register_mad_agent: QP 0 not supported" from mad.c.
>>>>>
>>>>> I wouldn't want to see same thing for RXE too.
>>>> Thanks.
>>>> How about WARN_ONCE(1, "QP 0 is not supported\n");?
>>>> This only prints once.
>>>> This can avoid investigating and this will not print too much messages.
>>> Why are you worried about this? No roce end point should even be able
>>> to send qp0 traffic. There is nothing to investigate.
>> If in some case, we need to find out where QP 0 is dropped. With this
>> warning, we can easily find
>> where QP 0 is dropped. Without this warning, maybe we will do some work.
> Let's do not over engineer it.
> WARN_ONCE will produce nice stack trace which is completely not needed.

WARN_ONCE only produces "QP 0 is not supported". It will not produce 
stack trace.

WARN_ON will produce stack trace.

Thanks,
Zhu Yanjun
>
> Thanks
>
>> Zhu Yanjun
>>> Jason
>>>

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Leon Romanovsky June 28, 2018, 4:52 a.m. UTC | #11
On Thu, Jun 28, 2018 at 12:31:14PM +0800, Yanjun Zhu wrote:
>
>
> On 2018/6/28 12:27, Leon Romanovsky wrote:
> > On Thu, Jun 28, 2018 at 12:22:41PM +0800, Yanjun Zhu wrote:
> > >
> > > On 2018/6/28 9:51, Jason Gunthorpe wrote:
> > > > On Thu, Jun 28, 2018 at 09:37:39AM +0800, Yanjun Zhu wrote:
> > > > > On 2018/6/27 13:56, Leon Romanovsky wrote:
> > > > > > On Wed, Jun 27, 2018 at 11:22:18AM +0800, Yanjun Zhu wrote:
> > > > > > > On 2018/6/24 23:55, Leon Romanovsky wrote:
> > > > > > > > On Wed, Jun 20, 2018 at 11:28:27PM -0400, Zhu Yanjun wrote:
> > > > > > > > > According to "Annex A16: RDMA over Converged Ethernet (RoCE)",
> > > > > > > > >
> > > > > > > > > A16.4.3 MANAGEMENT INTERFACES
> > > > > > > > >
> > > > > > > > > As defined in the base specification, a special Queue Pair, QP0 is defined
> > > > > > > > > solely for communication between subnet manager(s) and subnet management
> > > > > > > > > agents. Since such an IB-defined subnet management architecture
> > > > > > > > > is outside the scope of this annex, it follows that there is also no
> > > > > > > > > requirement that a port which conforms to this annex be associated with
> > > > > > > > > a QP0. Thus, for end nodes designed to conform to this annex, the concept
> > > > > > > > > of QP0 is undefined and unused for any port connected to an
> > > > > > > > > Ethernet network.
> > > > > > > > > CA16-8: A packet arriving at a RoCE port containing a BTH with the
> > > > > > > > > destination QP field set to QP0 shall be silently dropped.
> > > > > > > > >
> > > > > > > > > CC: Srinivas Eeda <srinivas.eeda@oracle.com>
> > > > > > > > > CC: Junxiao Bi <junxiao.bi@oracle.com>
> > > > > > > > > Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com>
> > > > > > > > >     drivers/infiniband/sw/rxe/rxe_recv.c | 9 +++++++--
> > > > > > > > >     1 file changed, 7 insertions(+), 2 deletions(-)
> > > > > > > > >
> > > > > > > > > diff --git a/drivers/infiniband/sw/rxe/rxe_recv.c b/drivers/infiniband/sw/rxe/rxe_recv.c
> > > > > > > > > index b0c8d1e..08cb97a 100644
> > > > > > > > > +++ b/drivers/infiniband/sw/rxe/rxe_recv.c
> > > > > > > > > @@ -225,9 +225,14 @@ static int hdr_check(struct rxe_pkt_info *pkt)
> > > > > > > > >     		goto err1;
> > > > > > > > >     	}
> > > > > > > > >
> > > > > > > > > +	if (unlikely(qpn == 0)) {
> > > > > > > > > +		pr_warn_ratelimited("QP 0 is not supported\n");
> > > > > > > > Safest option is not to print this warning at all.
> > > > > > > Sorry to reply late. It is to notify the users that QP 0 is dropped. So this
> > > > > > > warning is necessary. Or else the user
> > > > > > > will take time to investigate it.
> > > > > > > In a word, this warning can save time.
> > > > > > I'm saying it from my experience of running default fedora 28 with
> > > > > > default opensm on RoCE system where I over flooded with messages
> > > > > > "ib_register_mad_agent: QP 0 not supported" from mad.c.
> > > > > >
> > > > > > I wouldn't want to see same thing for RXE too.
> > > > > Thanks.
> > > > > How about WARN_ONCE(1, "QP 0 is not supported\n");?
> > > > > This only prints once.
> > > > > This can avoid investigating and this will not print too much messages.
> > > > Why are you worried about this? No roce end point should even be able
> > > > to send qp0 traffic. There is nothing to investigate.
> > > If in some case, we need to find out where QP 0 is dropped. With this
> > > warning, we can easily find
> > > where QP 0 is dropped. Without this warning, maybe we will do some work.
> > Let's do not over engineer it.
> > WARN_ONCE will produce nice stack trace which is completely not needed.
>
> WARN_ONCE only produces "QP 0 is not supported". It will not produce stack
> trace.

WARN_ONCE() -> WARN() -> __WARN_printf() ->  warn_slowpath_fmt() ->
__warn(... NULL ) -> dump_stack()

>
> WARN_ON will produce stack trace.
>
> Thanks,
> Zhu Yanjun
> >
> > Thanks
> >
> > > Zhu Yanjun
> > > > Jason
> > > >
>
Zhu Yanjun June 28, 2018, 5:03 a.m. UTC | #12
On 2018/6/28 12:52, Leon Romanovsky wrote:
> On Thu, Jun 28, 2018 at 12:31:14PM +0800, Yanjun Zhu wrote:
>>
>> On 2018/6/28 12:27, Leon Romanovsky wrote:
>>> On Thu, Jun 28, 2018 at 12:22:41PM +0800, Yanjun Zhu wrote:
>>>> On 2018/6/28 9:51, Jason Gunthorpe wrote:
>>>>> On Thu, Jun 28, 2018 at 09:37:39AM +0800, Yanjun Zhu wrote:
>>>>>> On 2018/6/27 13:56, Leon Romanovsky wrote:
>>>>>>> On Wed, Jun 27, 2018 at 11:22:18AM +0800, Yanjun Zhu wrote:
>>>>>>>> On 2018/6/24 23:55, Leon Romanovsky wrote:
>>>>>>>>> On Wed, Jun 20, 2018 at 11:28:27PM -0400, Zhu Yanjun wrote:
>>>>>>>>>> According to "Annex A16: RDMA over Converged Ethernet (RoCE)",
>>>>>>>>>>
>>>>>>>>>> A16.4.3 MANAGEMENT INTERFACES
>>>>>>>>>>
>>>>>>>>>> As defined in the base specification, a special Queue Pair, QP0 is defined
>>>>>>>>>> solely for communication between subnet manager(s) and subnet management
>>>>>>>>>> agents. Since such an IB-defined subnet management architecture
>>>>>>>>>> is outside the scope of this annex, it follows that there is also no
>>>>>>>>>> requirement that a port which conforms to this annex be associated with
>>>>>>>>>> a QP0. Thus, for end nodes designed to conform to this annex, the concept
>>>>>>>>>> of QP0 is undefined and unused for any port connected to an
>>>>>>>>>> Ethernet network.
>>>>>>>>>> CA16-8: A packet arriving at a RoCE port containing a BTH with the
>>>>>>>>>> destination QP field set to QP0 shall be silently dropped.
>>>>>>>>>>
>>>>>>>>>> CC: Srinivas Eeda <srinivas.eeda@oracle.com>
>>>>>>>>>> CC: Junxiao Bi <junxiao.bi@oracle.com>
>>>>>>>>>> Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com>
>>>>>>>>>>      drivers/infiniband/sw/rxe/rxe_recv.c | 9 +++++++--
>>>>>>>>>>      1 file changed, 7 insertions(+), 2 deletions(-)
>>>>>>>>>>
>>>>>>>>>> diff --git a/drivers/infiniband/sw/rxe/rxe_recv.c b/drivers/infiniband/sw/rxe/rxe_recv.c
>>>>>>>>>> index b0c8d1e..08cb97a 100644
>>>>>>>>>> +++ b/drivers/infiniband/sw/rxe/rxe_recv.c
>>>>>>>>>> @@ -225,9 +225,14 @@ static int hdr_check(struct rxe_pkt_info *pkt)
>>>>>>>>>>      		goto err1;
>>>>>>>>>>      	}
>>>>>>>>>>
>>>>>>>>>> +	if (unlikely(qpn == 0)) {
>>>>>>>>>> +		pr_warn_ratelimited("QP 0 is not supported\n");
>>>>>>>>> Safest option is not to print this warning at all.
>>>>>>>> Sorry to reply late. It is to notify the users that QP 0 is dropped. So this
>>>>>>>> warning is necessary. Or else the user
>>>>>>>> will take time to investigate it.
>>>>>>>> In a word, this warning can save time.
>>>>>>> I'm saying it from my experience of running default fedora 28 with
>>>>>>> default opensm on RoCE system where I over flooded with messages
>>>>>>> "ib_register_mad_agent: QP 0 not supported" from mad.c.
>>>>>>>
>>>>>>> I wouldn't want to see same thing for RXE too.
>>>>>> Thanks.
>>>>>> How about WARN_ONCE(1, "QP 0 is not supported\n");?
>>>>>> This only prints once.
>>>>>> This can avoid investigating and this will not print too much messages.
>>>>> Why are you worried about this? No roce end point should even be able
>>>>> to send qp0 traffic. There is nothing to investigate.
>>>> If in some case, we need to find out where QP 0 is dropped. With this
>>>> warning, we can easily find
>>>> where QP 0 is dropped. Without this warning, maybe we will do some work.
>>> Let's do not over engineer it.
>>> WARN_ONCE will produce nice stack trace which is completely not needed.
>> WARN_ONCE only produces "QP 0 is not supported". It will not produce stack
>> trace.
> WARN_ONCE() -> WARN() -> __WARN_printf() ->  warn_slowpath_fmt() ->
> __warn(... NULL ) -> dump_stack()

Sure. The WARN_ONCE should be removed. I will send V2 soon.:-)

Thanks,
Zhu Yanjun

>
>> WARN_ON will produce stack trace.
>>
>> Thanks,
>> Zhu Yanjun
>>> Thanks
>>>
>>>> Zhu Yanjun
>>>>> Jason
>>>>>

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/drivers/infiniband/sw/rxe/rxe_recv.c b/drivers/infiniband/sw/rxe/rxe_recv.c
index b0c8d1e..08cb97a 100644
--- a/drivers/infiniband/sw/rxe/rxe_recv.c
+++ b/drivers/infiniband/sw/rxe/rxe_recv.c
@@ -225,9 +225,14 @@  static int hdr_check(struct rxe_pkt_info *pkt)
 		goto err1;
 	}
 
+	if (unlikely(qpn == 0)) {
+		pr_warn_ratelimited("QP 0 is not supported\n");
+		goto err1;
+	}
+
 	if (qpn != IB_MULTICAST_QPN) {
-		index = (qpn == 0) ? port->qp_smi_index :
-			((qpn == 1) ? port->qp_gsi_index : qpn);
+		index = (qpn == 1) ? port->qp_gsi_index : qpn;
+
 		qp = rxe_pool_get_index(&rxe->qp_pool, index);
 		if (unlikely(!qp)) {
 			pr_warn_ratelimited("no qp matches qpn 0x%x\n", qpn);