Message ID | tencent_32C3AEB0599DF0A0010A862439636CDA2707@qq.com (mailing list archive) |
---|---|
State | Changes Requested |
Series | RDMA/siw: Reuse value read using READ_ONCE instead of re-reading it |
On 2024/3/9 13:27, linke li wrote:
> In siw_orqe_start_rx, the orqe's flag in the if condition is read using
> READ_ONCE, checked, and then re-read, voiding all guarantees of the
> checks. Reuse the value that was read by READ_ONCE to ensure the
> consistency of the flags throughout the function.
>
> Signed-off-by: linke li <lilinke99@qq.com>
> ---
>  drivers/infiniband/sw/siw/siw_qp_rx.c | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/infiniband/sw/siw/siw_qp_rx.c b/drivers/infiniband/sw/siw/siw_qp_rx.c
> index ed4fc39718b4..f5f69de56882 100644
> --- a/drivers/infiniband/sw/siw/siw_qp_rx.c
> +++ b/drivers/infiniband/sw/siw/siw_qp_rx.c
> @@ -740,6 +740,7 @@ static int siw_orqe_start_rx(struct siw_qp *qp)
>  {
>  	struct siw_sqe *orqe;
>  	struct siw_wqe *wqe = NULL;
> +	u16 orqe_flags;
>
>  	if (unlikely(!qp->attrs.orq_size))
>  		return -EPROTO;
> @@ -748,7 +749,8 @@ static int siw_orqe_start_rx(struct siw_qp *qp)
>  	smp_mb();
>
>  	orqe = orq_get_current(qp);
> -	if (READ_ONCE(orqe->flags) & SIW_WQE_VALID) {

In this if test, READ_ONCE is needed to read orqe->flags, but this commit
moves the READ_ONCE elsewhere. In a complicated environment, for example
when this function is called many times concurrently while orqe->flags is
being changed, I am not sure whether this will introduce risks.

If you need to ensure the consistency of the flags throughout the
function, I am not sure whether the following would be better:

if (((orqe_flags=READ_ONCE(orqe->flags))) & SIW_WQE_VALID) {

Thanks,
Zhu Yanjun

> +	orqe_flags = READ_ONCE(orqe->flags);
> +	if (orqe_flags & SIW_WQE_VALID) {
>  		/* RRESP is a TAGGED RDMAP operation */
>  		wqe = rx_wqe(&qp->rx_tagged);
>  		wqe->sqe.id = orqe->id;
> @@ -756,7 +758,7 @@ static int siw_orqe_start_rx(struct siw_qp *qp)
>  		wqe->sqe.sge[0].laddr = orqe->sge[0].laddr;
>  		wqe->sqe.sge[0].lkey = orqe->sge[0].lkey;
>  		wqe->sqe.sge[0].length = orqe->sge[0].length;
> -		wqe->sqe.flags = orqe->flags;
> +		wqe->sqe.flags = orqe_flags;
>  		wqe->sqe.num_sge = 1;
>  		wqe->bytes = orqe->sge[0].length;
>  		wqe->processed = 0;
On Sat, Mar 09, 2024 at 08:27:16PM +0800, linke li wrote:
> In siw_orqe_start_rx, the orqe's flag in the if condition is read using
> READ_ONCE, checked, and then re-read, voiding all guarantees of the
> checks. Reuse the value that was read by READ_ONCE to ensure the
> consistency of the flags throughout the function.

Please read the comments in include/asm-generic/rwonce.h on when
READ_ONCE() is used. There is no value in caching the output of
READ_ONCE().

Thanks
I want to emphasize that if the value of orqe->flags has changed by the
time of the second read, the value read may no longer satisfy the if
condition, causing an inconsistency, even though the first read already
uses READ_ONCE.
> In a complicated environment, for example when this function is called
> many times concurrently while orqe->flags is being changed, I am not
> sure whether this will introduce risks.

I think one purpose of READ_ONCE is to read a valid value while the value
may change concurrently. There is also an smp_mb() above the READ_ONCE,
which means the READ_ONCE is well ordered, so I think it is reasonably
safe here.

> If you need to ensure the consistency of the flags throughout the
> function, I am not sure whether the following would be better:
>
> if (((orqe_flags=READ_ONCE(orqe->flags))) & SIW_WQE_VALID) {

That looks like it does exactly the same thing as this patch. I think the
only difference is code style.

Thanks,
Linke
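The two spellings being compared above can be put side by side. The following is a minimal user-space sketch, not the siw code: `READ_ONCE` here is a simplified stand-in for the kernel macro (a single volatile load, using the GCC/Clang `__typeof__` extension), `struct entry` is hypothetical, and the value of `SIW_WQE_VALID` is only illustrative. Under those assumptions, each form performs exactly one load into a local and then tests and returns that same local.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified user-space stand-in for the kernel's READ_ONCE(): one volatile load. */
#define READ_ONCE(x) (*(const volatile __typeof__(x) *)&(x))

#define SIW_WQE_VALID 0x1u /* illustrative value for this sketch */

/* Hypothetical queue entry; only the flags field matters here. */
struct entry {
	uint16_t flags;
};

/* Form suggested in review: the assignment sits inside the condition. */
static uint16_t snapshot_in_condition(struct entry *e)
{
	uint16_t flags;

	if ((flags = READ_ONCE(e->flags)) & SIW_WQE_VALID)
		return flags;	/* same value that was tested */
	return 0;
}

/* Form used by the patch: the assignment is its own statement. */
static uint16_t snapshot_before_condition(struct entry *e)
{
	uint16_t flags = READ_ONCE(e->flags);

	if (flags & SIW_WQE_VALID)
		return flags;	/* same value that was tested */
	return 0;
}

/* Both forms must agree for any flags value when nothing else writes e. */
static void check_forms_agree(uint16_t value)
{
	struct entry e = { .flags = value };

	assert(snapshot_in_condition(&e) == snapshot_before_condition(&e));
}
```

Calling `check_forms_agree()` with a few values (valid bit set or clear) exercises both paths. The sketch says nothing about what a concurrent writer may do; it only shows that neither form re-reads the field after the test.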
On Sun, Mar 10, 2024 at 8:36 PM linke li <lilinke99@qq.com> wrote:
>
> > In a complicated environment, for example when this function is called
> > many times concurrently while orqe->flags is being changed, I am not
> > sure whether this will introduce risks.
>
> I think one purpose of READ_ONCE is to read a valid value while the value
> may change concurrently. There is also an smp_mb() above the READ_ONCE,
> which means the READ_ONCE is well ordered, so I think it is reasonably
> safe here.

This is not an SMP problem. Compared with the original source, your
commit introduces a time slot.

>
> > If you need to ensure the consistency of the flags throughout the
> > function, I am not sure whether the following would be better:
> >
> > if (((orqe_flags=READ_ONCE(orqe->flags))) & SIW_WQE_VALID) {
>
> That looks like it does exactly the same thing as this patch. I think the
> only difference is code style.

No.

>
> Thanks,
> Linke
On Sun, Mar 10, 2024 at 7:33 PM Leon Romanovsky <leon@kernel.org> wrote:
>
> On Sat, Mar 09, 2024 at 08:27:16PM +0800, linke li wrote:
> > In siw_orqe_start_rx, the orqe's flag in the if condition is read using
> > READ_ONCE, checked, and then re-read, voiding all guarantees of the
> > checks. Reuse the value that was read by READ_ONCE to ensure the
> > consistency of the flags throughout the function.
>
> Please read the comments in include/asm-generic/rwonce.h on when
> READ_ONCE() is used. There is no value in caching the output of
> READ_ONCE().

Agreed. Please also read
https://www.kernel.org/doc/Documentation/memory-barriers.txt

>
> Thanks
On Sun, Mar 10, 2024 at 08:15:25PM +0800, linke li wrote:
> I want to emphasize that if the value of orqe->flags has changed by the
> time of the second read, the value read may no longer satisfy the if
> condition, causing an inconsistency, even though the first read already
> uses READ_ONCE.

If the value can change between subsequent reads, then you need to use
locks to make sure that it doesn't happen. Using READ_ONCE() doesn't
solve the concurrency issue, but makes sure that the compiler doesn't
reorder reads and writes.

Thanks
> If the value can change between subsequent reads, then you need to use
> locks to make sure that it doesn't happen. Using READ_ONCE() doesn't
> solve the concurrency issue, but makes sure that the compiler doesn't
> reorder reads and writes.

This code does not need to prevent other threads from writing the flags.

This topic got quite a bit of discussion [1]; to quote from it:

  (READ_ONCE and WRITE_ONCE)
  That's often useful - lots of code doesn't really care if you get the
  old or the new value, but the code *does* care that it gets *one*
  value, and not some random mix of "I tested one value for validity,
  then it got reloaded due to register pressure, and I actually used
  another value".

  And not some "I read one value, and it was a mix of two other values".

In the original code, the first read seems to serve the same purpose, so
READ_ONCE is probably OK here.

I just want to make sure the flags stored to wqe->sqe.flags is consistent
with the read used in the if condition.

[1] https://lore.kernel.org/lkml/CAHk-=wgG6Dmt1JTXDbrbXh_6s2yLjL=9pHo7uv0==LHFD+aBtg@mail.gmail.com/
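The "tested one value, used another" case in the quote can be made concrete with a deterministic sketch. Everything here is hypothetical and single-threaded: `struct scripted_flags` and `load_flags()` emulate a field whose content changes between two loads by replaying a fixed sequence, and `VALID` is an illustrative bit. Whether such a change can actually happen to an siw orq entry depends on the queue's protocol, which this sketch does not model.

```c
#include <assert.h>
#include <stdint.h>

#define VALID 0x1u /* illustrative flag bit */

/*
 * Hypothetical flag source that emulates a concurrent writer:
 * each load returns the next value of a scripted sequence.
 */
struct scripted_flags {
	const uint16_t *values;
	unsigned int n;
	unsigned int pos;
};

static uint16_t load_flags(struct scripted_flags *s)
{
	uint16_t v = s->values[s->pos];

	if (s->pos + 1 < s->n)
		s->pos++;	/* the next load sees the "changed" value */
	return v;
}

/* Tests one load, stores a second one: the stored value may differ. */
static int copy_flags_double_read(struct scripted_flags *s, uint16_t *out)
{
	if (load_flags(s) & VALID) {
		*out = load_flags(s);
		return 1;
	}
	return 0;
}

/* Tests and stores the same snapshot. */
static int copy_flags_single_read(struct scripted_flags *s, uint16_t *out)
{
	uint16_t flags = load_flags(s);

	if (flags & VALID) {
		*out = flags;
		return 1;
	}
	return 0;
}

/* Replays "valid, then cleared" against both variants. */
static void demo_scripted_change(void)
{
	static const uint16_t seq[] = { 0x5, 0x0 };
	struct scripted_flags a = { seq, 2, 0 };
	struct scripted_flags b = { seq, 2, 0 };
	uint16_t out_a = 0xffff, out_b = 0xffff;

	assert(copy_flags_double_read(&a, &out_a) == 1);
	assert(!(out_a & VALID));	/* stored value lost the tested bit */
	assert(copy_flags_single_read(&b, &out_b) == 1);
	assert(out_b & VALID);		/* stored value is the tested one */
}
```

The single-read variant does not stop the underlying field from changing; it only guarantees that the value stored is the value that passed the test.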
> This is not an SMP problem. Compared with the original source, your
> commit introduces a time slot.

I don't know what you mean by a time slot. At the binary level, they have
the same code.
On 2024/3/11 3:34, linke li wrote:
>> If the value can change between subsequent reads, then you need to use
>> locks to make sure that it doesn't happen. Using READ_ONCE() doesn't
>> solve the concurrency issue, but makes sure that the compiler doesn't
>> reorder reads and writes.
>
> This code does not need to prevent other threads from writing the flags.
>
> This topic got quite a bit of discussion [1]; to quote from it:
>
>   (READ_ONCE and WRITE_ONCE)
>   That's often useful - lots of code doesn't really care if you get the
>   old or the new value, but the code *does* care that it gets *one*
>   value, and not some random mix of "I tested one value for validity,
>   then it got reloaded due to register pressure, and I actually used
>   another value".
>
>   And not some "I read one value, and it was a mix of two other values".
>
> In the original code, the first read seems to serve the same purpose, so
> READ_ONCE is probably OK here.
>
> I just want to make sure the flags stored to wqe->sqe.flags is consistent
> with the read used in the if condition.

Sure. Following Leon's advice, to make this ("wqe->sqe.flags is
consistent with the read used in the if condition") happen, you need a
lock. The lock can be a spinlock or a mutex, depending on whether the
context can sleep.

Judging from the original source code, wqe->sqe.flags should be treated
as a volatile variable. It should be read from its original location, not
from a cached copy.

Zhu Yanjun

>
> [1] https://lore.kernel.org/lkml/CAHk-=wgG6Dmt1JTXDbrbXh_6s2yLjL=9pHo7uv0==LHFD+aBtg@mail.gmail.com/
In the original source code, READ_ONCE(xxx) is inside the if test. Your
commit moves READ_ONCE out of the if test, so a time slot now exists
between fetching the value and using it. In the original source code,
that time slot does not exist.

The fetch and the use are also not protected by locks, which is what Leon
suggested. This will introduce risks.

The generated binary depends on the optimization level and the
architecture. It is very complicated.

Zhu Yanjun

On 11.03.24 03:57, linke li wrote:
>> This is not an SMP problem. Compared with the original source, your
>> commit introduces a time slot.
> I don't know what you mean by a time slot. At the binary level, they have
> the same code.
> -----Original Message-----
> From: linke li <lilinke99@qq.com>
> Sent: Saturday, March 9, 2024 1:27 PM
> Cc: lilinke99@qq.com; Bernard Metzler <BMT@zurich.ibm.com>; Jason Gunthorpe
> <jgg@ziepe.ca>; Leon Romanovsky <leon@kernel.org>;
> linux-rdma@vger.kernel.org; linux-kernel@vger.kernel.org
> Subject: [EXTERNAL] [PATCH] RDMA/siw: Reuse value read using READ_ONCE
> instead of re-reading it
>
> In siw_orqe_start_rx, the orqe's flag in the if condition is read using
> READ_ONCE, checked, and then re-read, voiding all guarantees of the
> checks. Reuse the value that was read by READ_ONCE to ensure the
> consistency of the flags throughout the function.
>
> Signed-off-by: linke li <lilinke99@qq.com>
> ---
>  drivers/infiniband/sw/siw/siw_qp_rx.c | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/infiniband/sw/siw/siw_qp_rx.c
> b/drivers/infiniband/sw/siw/siw_qp_rx.c
> index ed4fc39718b4..f5f69de56882 100644
> --- a/drivers/infiniband/sw/siw/siw_qp_rx.c
> +++ b/drivers/infiniband/sw/siw/siw_qp_rx.c
> @@ -740,6 +740,7 @@ static int siw_orqe_start_rx(struct siw_qp *qp)
>  {
>  	struct siw_sqe *orqe;
>  	struct siw_wqe *wqe = NULL;
> +	u16 orqe_flags;
>
>  	if (unlikely(!qp->attrs.orq_size))
>  		return -EPROTO;
> @@ -748,7 +749,8 @@ static int siw_orqe_start_rx(struct siw_qp *qp)
>  	smp_mb();
>
>  	orqe = orq_get_current(qp);
> -	if (READ_ONCE(orqe->flags) & SIW_WQE_VALID) {
> +	orqe_flags = READ_ONCE(orqe->flags);
> +	if (orqe_flags & SIW_WQE_VALID) {
>  		/* RRESP is a TAGGED RDMAP operation */
>  		wqe = rx_wqe(&qp->rx_tagged);
>  		wqe->sqe.id = orqe->id;
> @@ -756,7 +758,7 @@ static int siw_orqe_start_rx(struct siw_qp *qp)
>  		wqe->sqe.sge[0].laddr = orqe->sge[0].laddr;
>  		wqe->sqe.sge[0].lkey = orqe->sge[0].lkey;
>  		wqe->sqe.sge[0].length = orqe->sge[0].length;
> -		wqe->sqe.flags = orqe->flags;
> +		wqe->sqe.flags = orqe_flags;
>  		wqe->sqe.num_sge = 1;
>  		wqe->bytes = orqe->sge[0].length;
>  		wqe->processed = 0;
> --
> 2.39.3 (Apple Git-146)
>
>

The outbound read queue (orq) is a ring buffer with only one consumer
(this code) and one producer (READ.request sending code). There is no
parallel reader and a single writer.

The producer (sender of the READ.request) sets the orq entry valid and
does this only once after completely writing the entry. It does it under
qp->orq_lock.

Only if we find the orq entry valid, its content gets copied at the
beginning of a new READ.response (this code). The orq entry remains
valid to stop the producer from re-using it until the complete
READ.response has been received (may be multiple fragments). The flag
gets cleared under qp->orq_lock after the complete READ.response has
been received, or the response was invalid.

There is no possibility a valid orq entry gets invalidated after it has
been found valid, so it is safe to copy all its members.

Thanks,
Bernard.
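The ownership protocol Bernard describes (the producer fills an entry and marks it valid last; the consumer copies only a valid entry and clears the flag once the whole response is done) can be sketched as a small single-producer, single-consumer ring. This is a user-space illustration, not the siw implementation: the names `struct orq`, `orq_post`, `orq_start_rx` and `orq_complete` are hypothetical, and C11 release/acquire atomics stand in for the `qp->orq_lock`, `smp_mb()` and `READ_ONCE()` used in the driver.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define ORQ_SIZE 4u
#define ENTRY_VALID 0x1u /* illustrative flag bit */

struct orq_entry {
	uint64_t id;
	uint32_t length;
	_Atomic uint16_t flags;
};

struct orq {
	struct orq_entry ring[ORQ_SIZE];
	unsigned int put;	/* touched only by the producer */
	unsigned int get;	/* touched only by the consumer */
};

/* Producer: write the whole entry first, publish the valid flag last. */
static int orq_post(struct orq *q, uint64_t id, uint32_t length)
{
	struct orq_entry *e = &q->ring[q->put % ORQ_SIZE];

	if (atomic_load_explicit(&e->flags, memory_order_acquire) & ENTRY_VALID)
		return -1;	/* slot still owned by the consumer */
	e->id = id;
	e->length = length;
	atomic_store_explicit(&e->flags, ENTRY_VALID, memory_order_release);
	q->put++;
	return 0;
}

/* Consumer: copy the members only if the entry is valid; it stays valid. */
static int orq_start_rx(struct orq *q, uint64_t *id, uint32_t *length)
{
	struct orq_entry *e = &q->ring[q->get % ORQ_SIZE];

	if (!(atomic_load_explicit(&e->flags, memory_order_acquire) & ENTRY_VALID))
		return -1;
	*id = e->id;
	*length = e->length;
	return 0;
}

/* Consumer: hand the slot back once the complete response was processed. */
static void orq_complete(struct orq *q)
{
	struct orq_entry *e = &q->ring[q->get % ORQ_SIZE];

	atomic_store_explicit(&e->flags, 0, memory_order_release);
	q->get++;
}

/* Single-threaded walk through one post / receive / complete cycle. */
static void demo_orq(void)
{
	struct orq q = { 0 };
	uint64_t id = 0;
	uint32_t len = 0;

	assert(orq_start_rx(&q, &id, &len) == -1);	/* nothing posted yet */
	assert(orq_post(&q, 42, 128) == 0);
	assert(orq_start_rx(&q, &id, &len) == 0);
	assert(id == 42 && len == 128);
	orq_complete(&q);
	assert(orq_start_rx(&q, &id, &len) == -1);	/* slot handed back */
}
```

In this sketch only the consumer ever clears a valid flag, and only in `orq_complete()`, so an entry that `orq_start_rx()` found valid cannot become invalid while its members are being copied. That is the property the argument above relies on.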
Thank you for your reasonable reply. That makes sense. But you may still
consider improving it, as this patch does, by reading the flag only once.
That would avoid some potential risks. However, it is the maintainer's
choice.

Thanks,
Linke
On Tue, Mar 12, 2024 at 09:30:53AM +0800, linke li wrote:
> Thank you for your reasonable reply. That makes sense. But you may still
> consider improving it, as this patch does, by reading the flag only once.
> That would avoid some potential risks. However, it is the maintainer's
> choice.

The maintainer doesn't see any potential risks, and the value is read
only once anyway.

Thanks

>
> Thanks,
> Linke
diff --git a/drivers/infiniband/sw/siw/siw_qp_rx.c b/drivers/infiniband/sw/siw/siw_qp_rx.c
index ed4fc39718b4..f5f69de56882 100644
--- a/drivers/infiniband/sw/siw/siw_qp_rx.c
+++ b/drivers/infiniband/sw/siw/siw_qp_rx.c
@@ -740,6 +740,7 @@ static int siw_orqe_start_rx(struct siw_qp *qp)
 {
 	struct siw_sqe *orqe;
 	struct siw_wqe *wqe = NULL;
+	u16 orqe_flags;

 	if (unlikely(!qp->attrs.orq_size))
 		return -EPROTO;
@@ -748,7 +749,8 @@ static int siw_orqe_start_rx(struct siw_qp *qp)
 	smp_mb();

 	orqe = orq_get_current(qp);
-	if (READ_ONCE(orqe->flags) & SIW_WQE_VALID) {
+	orqe_flags = READ_ONCE(orqe->flags);
+	if (orqe_flags & SIW_WQE_VALID) {
 		/* RRESP is a TAGGED RDMAP operation */
 		wqe = rx_wqe(&qp->rx_tagged);
 		wqe->sqe.id = orqe->id;
@@ -756,7 +758,7 @@ static int siw_orqe_start_rx(struct siw_qp *qp)
 		wqe->sqe.sge[0].laddr = orqe->sge[0].laddr;
 		wqe->sqe.sge[0].lkey = orqe->sge[0].lkey;
 		wqe->sqe.sge[0].length = orqe->sge[0].length;
-		wqe->sqe.flags = orqe->flags;
+		wqe->sqe.flags = orqe_flags;
 		wqe->sqe.num_sge = 1;
 		wqe->bytes = orqe->sge[0].length;
 		wqe->processed = 0;
In siw_orqe_start_rx, the orqe's flag in the if condition is read using
READ_ONCE, checked, and then re-read, voiding all guarantees of the
checks. Reuse the value that was read by READ_ONCE to ensure the
consistency of the flags throughout the function.

Signed-off-by: linke li <lilinke99@qq.com>
---
 drivers/infiniband/sw/siw/siw_qp_rx.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)