Message ID | cover.1640862842.git.leonro@nvidia.com (mailing list archive)
---|---
Series | MR cache enhancement
On Sun, Jan 02, 2022 at 11:03:10AM +0800, Hillf Danton wrote:
> On Thu, 30 Dec 2021 13:23:23 +0200
> > From: Aharon Landau <aharonl@nvidia.com>
> >
> > When restarting an application with many non-cached mkeys, all the
> > mkeys will be destroyed and then recreated.
> >
> > This process takes a long time (about 20 seconds for deregistration
> > and 28 seconds for registration of 100,000 MRs).
> >
> > To shorten the restart runtime, insert the mkeys temporarily into the
> > cache and schedule a delayed work to destroy them later. If there is
> > no fitting entry for these mkeys, create a temporary entry that fits
> > them.
> >
> > If 30 seconds have passed and no user reclaimed the temporarily
> > cached mkeys, the scheduled work will destroy the mkeys and the
> > temporary entries.
> >
> > When restarting an application, the mkeys will still be in the cache
> > when trying to reg them again, therefore, the registration will be
> > faster (4 seconds for deregistration and 5 seconds for registration
> > of 100,000 MRs).
> >
> > Signed-off-by: Aharon Landau <aharonl@nvidia.com>
> > Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
> > ---
> >  drivers/infiniband/hw/mlx5/mlx5_ib.h |   3 +
> >  drivers/infiniband/hw/mlx5/mr.c      | 131 ++++++++++++++++++++++++++-
> >  2 files changed, 132 insertions(+), 2 deletions(-)

<...>

> > +	if (!ent->is_tmp)
> > +		mr->mmkey.cache_ent = ent;
> > +	else {
> > +		ent->total_mrs--;
> > +		cancel_delayed_work(&ent->dev->cache.remove_ent_dwork);
> > +		queue_delayed_work(ent->dev->cache.wq,
> > +				   &ent->dev->cache.remove_ent_dwork,
> > +				   msecs_to_jiffies(30 * 1000));
> > +	}
>
> Nit: collapse cancel and queue into mod_delayed_work().
>
> > 	}

<...>

> > +	INIT_WORK(&ent->work, cache_work_func);
> > +	INIT_DELAYED_WORK(&ent->dwork, delayed_cache_work_func);
>
> More important IMHO is to cut work in a separate patch, given that dwork
> can be queued with zero delay and both work callbacks are simple
> wrappers of __cache_work_func().

Thanks, I'll collect more feedback and resubmit.

> Hillf
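For reference, Hillf's nit applied to the quoted hunk would read roughly as
below. This is only a sketch reusing the field names from the patch above,
not the resubmitted code; mod_delayed_work() cancels any pending timer and
re-queues the work with the new delay in one call, matching the
cancel-then-queue pair it replaces.

	if (!ent->is_tmp)
		mr->mmkey.cache_ent = ent;
	else {
		ent->total_mrs--;
		/* re-arm the 30s cleanup in one call instead of cancel+queue */
		mod_delayed_work(ent->dev->cache.wq,
				 &ent->dev->cache.remove_ent_dwork,
				 msecs_to_jiffies(30 * 1000));
	}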
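Similarly, the consolidation Hillf suggests (dropping ent->work and keeping
only the delayed work) could look roughly like the sketch below. It assumes
the struct mlx5_cache_ent layout from mlx5_ib.h and hypothetical call
sites; since a delayed work can be queued with zero delay, the plain work
item and its cache_work_func() wrapper become redundant.

	static void delayed_cache_work_func(struct work_struct *work)
	{
		struct mlx5_cache_ent *ent;

		/* dwork.work embeds the work_struct inside the delayed_work */
		ent = container_of(work, struct mlx5_cache_ent, dwork.work);
		__cache_work_func(ent);
	}

	...

	/* setup: only one work item left to initialize */
	INIT_DELAYED_WORK(&ent->dwork, delayed_cache_work_func);

	/* former queue_work(cache->wq, &ent->work) call sites become: */
	queue_delayed_work(cache->wq, &ent->dwork, 0);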
From: Leon Romanovsky <leonro@nvidia.com>

Changelog:
v1:
 * Based on DM revert https://lore.kernel.org/all/20211222101312.1358616-1-maorg@nvidia.com
v0: https://lore.kernel.org/all/cover.1638781506.git.leonro@nvidia.com
---------------------------------------------------------
Hi,

This series from Aharon refactors the mlx5 MR cache management logic to
speed up deregistration significantly.

Thanks

Aharon Landau (7):
  RDMA/mlx5: Merge similar flows of allocating MR from the cache
  RDMA/mlx5: Replace cache list with Xarray
  RDMA/mlx5: Store in the cache mkeys instead of mrs
  RDMA/mlx5: Reorder calls to pcie_relaxed_ordering_enabled()
  RDMA/mlx5: Change the cache structure to an RB-tree
  RDMA/mlx5: Delay the deregistration of a non-cache mkey
  RDMA/mlx5: Rename the mkey cache variables and functions

 drivers/infiniband/hw/mlx5/main.c    |    4 +-
 drivers/infiniband/hw/mlx5/mlx5_ib.h |   76 +-
 drivers/infiniband/hw/mlx5/mr.c      | 1021 +++++++++++++++++---------
 drivers/infiniband/hw/mlx5/odp.c     |   72 +-
 include/linux/mlx5/driver.h          |    7 +-
 5 files changed, 741 insertions(+), 439 deletions(-)