Message ID | 4641d8f79a88b07925cab0d8cd1ffc032a9115ef.1706185318.git.leon@kernel.org
---|---
State | Superseded
Series | Collection of mlx5_ib fixes
On 2024/1/25 20:30, Leon Romanovsky wrote:
> From: Or Har-Toov <ohartoov@nvidia.com>
>
> In the dereg flow, UMEM is not a good enough indication whether an MR
> is from userspace since in mlx5_ib_rereg_user_mr there are some cases
> when a new MR is created and the UMEM of the old MR is set to NULL.
> Currently when mlx5_ib_dereg_mr is called on the old MR, UMEM is NULL
> but cache_ent can be different than NULL. So, the mkey will not be
> destroyed.
> Therefore checking if mkey is from user application and cacheable
> should be done by checking if rb_key or cache_ent exist and all other kind of
> mkeys should be destroyed.
>
> Fixes: dd1b913fb0d0 ("RDMA/mlx5: Cache all user cacheable mkeys on dereg MR flow")
> Signed-off-by: Or Har-Toov <ohartoov@nvidia.com>
> Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
> ---
>  drivers/infiniband/hw/mlx5/mr.c | 15 ++++++++-------
>  1 file changed, 8 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/infiniband/hw/mlx5/mr.c b/drivers/infiniband/hw/mlx5/mr.c
> index 12bca6ca4760..3c241898e064 100644
> --- a/drivers/infiniband/hw/mlx5/mr.c
> +++ b/drivers/infiniband/hw/mlx5/mr.c
> @@ -1857,6 +1857,11 @@ static int cache_ent_find_and_store(struct mlx5_ib_dev *dev,
>          return ret;
>  }
>
> +static bool is_cacheable_mkey(struct mlx5_ib_mkey mkey)

I think it's better using a pointer as the parameter instead of the struct itself.

Junxian

> +{
> +        return mkey.cache_ent || mkey.rb_key.ndescs;
> +}
> +
>  int mlx5_ib_dereg_mr(struct ib_mr *ibmr, struct ib_udata *udata)
>  {
>          struct mlx5_ib_mr *mr = to_mmr(ibmr);
> @@ -1901,12 +1906,6 @@ int mlx5_ib_dereg_mr(struct ib_mr *ibmr, struct ib_udata *udata)
>                  mr->sig = NULL;
>          }
>
> -        /* Stop DMA */
> -        if (mr->umem && mlx5r_umr_can_load_pas(dev, mr->umem->length))
> -                if (mlx5r_umr_revoke_mr(mr) ||
> -                    cache_ent_find_and_store(dev, mr))
> -                        mr->mmkey.cache_ent = NULL;
> -
>          if (mr->umem && mr->umem->is_peer) {
>                  rc = mlx5r_umr_revoke_mr(mr);
>                  if (rc)
> @@ -1914,7 +1913,9 @@ int mlx5_ib_dereg_mr(struct ib_mr *ibmr, struct ib_udata *udata)
>                  ib_umem_stop_invalidation_notifier(mr->umem);
>          }
>
> -        if (!mr->mmkey.cache_ent) {
> +        /* Stop DMA */
> +        if (!is_cacheable_mkey(mr->mmkey) || mlx5r_umr_revoke_mr(mr) ||
> +            cache_ent_find_and_store(dev, mr)) {
>                  rc = destroy_mkey(to_mdev(mr->ibmr.device), mr);
>                  if (rc)
>                          return rc;
On 2024/1/25 20:52, Junxian Huang wrote:
> On 2024/1/25 20:30, Leon Romanovsky wrote:
>> From: Or Har-Toov <ohartoov@nvidia.com>
>> [...]
>> +static bool is_cacheable_mkey(struct mlx5_ib_mkey mkey)
>
> I think it's better using a pointer as the parameter instead of the struct itself.

Why do you think that a pointer is better than the struct itself? In the kernel docs, is there any rule about this?

Thanks.

Zhu Yanjun
On Thu, Jan 25, 2024 at 08:52:57PM +0800, Junxian Huang wrote:
> On 2024/1/25 20:30, Leon Romanovsky wrote:
> > From: Or Har-Toov <ohartoov@nvidia.com>
> > [...]
> > +static bool is_cacheable_mkey(struct mlx5_ib_mkey mkey)
>
> I think it's better using a pointer as the parameter instead of the struct itself.

Indeed, that looks like a typo

Thanks,
Jason
On Thu, Jan 25, 2024 at 09:38:24AM -0400, Jason Gunthorpe wrote:
> On Thu, Jan 25, 2024 at 08:52:57PM +0800, Junxian Huang wrote:
> > On 2024/1/25 20:30, Leon Romanovsky wrote:
> > > From: Or Har-Toov <ohartoov@nvidia.com>
> > > [...]
> > > +static bool is_cacheable_mkey(struct mlx5_ib_mkey mkey)
> >
> > I think it's better using a pointer as the parameter instead of the struct itself.
>
> Indeed, that looks like a typo

It is suboptimal to pass a struct by value, because the whole struct will be copied, but it is not a mistake either.

Thanks

> Thanks,
> Jason
On 2024/1/25 21:30, Zhu Yanjun wrote:
> On 2024/1/25 20:52, Junxian Huang wrote:
>> On 2024/1/25 20:30, Leon Romanovsky wrote:
>>> From: Or Har-Toov <ohartoov@nvidia.com>
>>> [...]
>>> +static bool is_cacheable_mkey(struct mlx5_ib_mkey mkey)
>>
>> I think it's better using a pointer as the parameter instead of the struct itself.
>
> Why do you think that a pointer is better than the struct itself? In the kernel docs, is there any rule about this?
>
> Thanks.
>
> Zhu Yanjun

I don't think there is a rule, but just like what Leon said, it's a suboptimal choice.

Junxian
On 2024/1/26 4:02, Leon Romanovsky wrote:
> On Thu, Jan 25, 2024 at 09:38:24AM -0400, Jason Gunthorpe wrote:
>> On Thu, Jan 25, 2024 at 08:52:57PM +0800, Junxian Huang wrote:
>>> On 2024/1/25 20:30, Leon Romanovsky wrote:
>>>> From: Or Har-Toov <ohartoov@nvidia.com>
>>>> [...]
>>>> +static bool is_cacheable_mkey(struct mlx5_ib_mkey mkey)
>>>
>>> I think it's better using a pointer as the parameter instead of the struct itself.
>>
>> Indeed, that looks like a typo
>
> It is suboptimal to pass a struct by value, because the whole struct will be copied,

Agree. With a pointer, an address is passed. With the struct, the whole struct is copied and passed. The overhead of calling the function is less with a pointer.

Zhu Yanjun

> but it is not a mistake either.
>
> Thanks
>
>> Thanks,
>> Jason
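As an aside for readers following the by-value vs. by-pointer discussion above, here is a minimal, self-contained C sketch of the difference. It is not mlx5 code; `struct example_key` and both helper names are invented for illustration, with a padding array only to make the copy cost visible:

```c
#include <stdbool.h>
#include <stdio.h>

/* Invented stand-in for a descriptor struct; NOT the real struct mlx5_ib_mkey. */
struct example_key {
	void *cache_ent;
	unsigned int ndescs;
	char payload[64];   /* padding just to make the copy cost visible */
};

/* By value: every call copies the whole sizeof(struct example_key) object. */
static bool is_cacheable_by_value(struct example_key key)
{
	return key.cache_ent || key.ndescs;
}

/* By pointer: only a pointer-sized address is passed; nothing is copied. */
static bool is_cacheable_by_pointer(const struct example_key *key)
{
	return key->cache_ent || key->ndescs;
}

int main(void)
{
	struct example_key key = { .cache_ent = NULL, .ndescs = 4 };

	printf("struct size: %zu bytes, pointer size: %zu bytes\n",
	       sizeof(key), sizeof(&key));
	printf("by value: %d, by pointer: %d\n",
	       is_cacheable_by_value(key), is_cacheable_by_pointer(&key));
	return 0;
}
```

Both variants compute the same result; the pointer version simply avoids copying the struct on every call, which is why the reviewers call passing by value suboptimal rather than incorrect.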
diff --git a/drivers/infiniband/hw/mlx5/mr.c b/drivers/infiniband/hw/mlx5/mr.c
index 12bca6ca4760..3c241898e064 100644
--- a/drivers/infiniband/hw/mlx5/mr.c
+++ b/drivers/infiniband/hw/mlx5/mr.c
@@ -1857,6 +1857,11 @@ static int cache_ent_find_and_store(struct mlx5_ib_dev *dev,
         return ret;
 }
 
+static bool is_cacheable_mkey(struct mlx5_ib_mkey mkey)
+{
+        return mkey.cache_ent || mkey.rb_key.ndescs;
+}
+
 int mlx5_ib_dereg_mr(struct ib_mr *ibmr, struct ib_udata *udata)
 {
         struct mlx5_ib_mr *mr = to_mmr(ibmr);
@@ -1901,12 +1906,6 @@ int mlx5_ib_dereg_mr(struct ib_mr *ibmr, struct ib_udata *udata)
                 mr->sig = NULL;
         }
 
-        /* Stop DMA */
-        if (mr->umem && mlx5r_umr_can_load_pas(dev, mr->umem->length))
-                if (mlx5r_umr_revoke_mr(mr) ||
-                    cache_ent_find_and_store(dev, mr))
-                        mr->mmkey.cache_ent = NULL;
-
         if (mr->umem && mr->umem->is_peer) {
                 rc = mlx5r_umr_revoke_mr(mr);
                 if (rc)
@@ -1914,7 +1913,9 @@ int mlx5_ib_dereg_mr(struct ib_mr *ibmr, struct ib_udata *udata)
                 ib_umem_stop_invalidation_notifier(mr->umem);
         }
 
-        if (!mr->mmkey.cache_ent) {
+        /* Stop DMA */
+        if (!is_cacheable_mkey(mr->mmkey) || mlx5r_umr_revoke_mr(mr) ||
+            cache_ent_find_and_store(dev, mr)) {
                 rc = destroy_mkey(to_mdev(mr->ibmr.device), mr);
                 if (rc)
                         return rc;
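To make the control flow of the final hunk easier to follow, here is a small, self-contained userspace C model (not kernel code; `dereg_destroys_mkey_model` and its parameters are invented for illustration) of the short-circuiting condition `!is_cacheable_mkey(mr->mmkey) || mlx5r_umr_revoke_mr(mr) || cache_ent_find_and_store(dev, mr)`: only cacheable mkeys attempt the revoke-and-store path, and any failure along that path falls through to destroying the mkey.

```c
#include <stdbool.h>
#include <stdio.h>

/*
 * Toy model of the dereg decision in the final hunk: the mkey is kept
 * (parked in the cache) only if it is cacheable AND the revoke succeeds
 * AND it is stored in a cache entry; otherwise it is destroyed.
 * 0 means success for the two helper results, mirroring the kernel
 * convention used in the patch. All names here are invented.
 */
static bool dereg_destroys_mkey_model(bool cacheable, int revoke_ret, int store_ret)
{
	/*
	 * Mirrors: if (!is_cacheable_mkey(mr->mmkey) || mlx5r_umr_revoke_mr(mr) ||
	 *              cache_ent_find_and_store(dev, mr)) { destroy_mkey(...); }
	 * Short-circuit evaluation means revoke/store only run for cacheable mkeys.
	 */
	if (!cacheable || revoke_ret || store_ret)
		return true;    /* destroy_mkey() path */
	return false;           /* mkey stays in the cache */
}

int main(void)
{
	printf("non-cacheable           -> destroy=%d\n", dereg_destroys_mkey_model(false, 0, 0));
	printf("cacheable, all ok       -> destroy=%d\n", dereg_destroys_mkey_model(true, 0, 0));
	printf("cacheable, revoke fails -> destroy=%d\n", dereg_destroys_mkey_model(true, -5, 0));
	printf("cacheable, store fails  -> destroy=%d\n", dereg_destroys_mkey_model(true, 0, -12));
	return 0;
}
```

Per the commit message, the old code keyed this decision off mr->umem: a re-registered MR whose umem had been set to NULL skipped the revoke/store branch yet kept a non-NULL cache_ent, so destroy_mkey() was never reached; basing the check on cache_ent/rb_key instead of umem closes that gap.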