Message ID | 20210720081647.1980-4-yishaih@nvidia.com (mailing list archive) |
---|---|
State | Not Applicable |
Headers | show |
Series | Introduce mlx5 user space driver over VFIO | expand |
On Tue, Jul 20, 2021 at 11:16:23AM +0300, Yishai Hadas wrote: > From: Maor Gottlieb <maorg@nvidia.com> > > Usage will be in next patches. > > Signed-off-by: Maor Gottlieb <maorg@nvidia.com> > Signed-off-by: Yishai Hadas <yishaih@nvidia.com> > --- > providers/mlx5/mlx5.c | 28 ++++++++++++++-------------- > providers/mlx5/mlx5.h | 4 ++++ > 2 files changed, 18 insertions(+), 14 deletions(-) Probably, this patch will be needed to be changed after "Verbs logging API" PR https://github.com/linux-rdma/rdma-core/pull/1030 Thanks
On 7/20/2021 11:51 AM, Leon Romanovsky wrote: > On Tue, Jul 20, 2021 at 11:16:23AM +0300, Yishai Hadas wrote: >> From: Maor Gottlieb <maorg@nvidia.com> >> >> Usage will be in next patches. >> >> Signed-off-by: Maor Gottlieb <maorg@nvidia.com> >> Signed-off-by: Yishai Hadas <yishaih@nvidia.com> >> --- >> providers/mlx5/mlx5.c | 28 ++++++++++++++-------------- >> providers/mlx5/mlx5.h | 4 ++++ >> 2 files changed, 18 insertions(+), 14 deletions(-) > Probably, this patch will be needed to be changed after > "Verbs logging API" PR https://github.com/linux-rdma/rdma-core/pull/1030 > > Thanks Well, not really, this patch just reorganizes things inside mlx5 for code sharing. Yishai
On Tue, Jul 20, 2021 at 12:27:46PM +0300, Yishai Hadas wrote: > On 7/20/2021 11:51 AM, Leon Romanovsky wrote: > > On Tue, Jul 20, 2021 at 11:16:23AM +0300, Yishai Hadas wrote: > > > From: Maor Gottlieb <maorg@nvidia.com> > > > > > > Usage will be in next patches. > > > > > > Signed-off-by: Maor Gottlieb <maorg@nvidia.com> > > > Signed-off-by: Yishai Hadas <yishaih@nvidia.com> > > > --- > > > providers/mlx5/mlx5.c | 28 ++++++++++++++-------------- > > > providers/mlx5/mlx5.h | 4 ++++ > > > 2 files changed, 18 insertions(+), 14 deletions(-) > > Probably, this patch will be needed to be changed after > > "Verbs logging API" PR https://github.com/linux-rdma/rdma-core/pull/1030 > > > > Thanks > > Well, not really, this patch just reorganizes things inside mlx5 for code > sharing. After Gal's PR, I expect to see all mlx5 file/debug logic gone except some minimal conversion logic to support old legacy variables. All that get_env_... code will go. Thanks > > Yishai >
On 7/20/2021 3:27 PM, Leon Romanovsky wrote: > On Tue, Jul 20, 2021 at 12:27:46PM +0300, Yishai Hadas wrote: >> On 7/20/2021 11:51 AM, Leon Romanovsky wrote: >>> On Tue, Jul 20, 2021 at 11:16:23AM +0300, Yishai Hadas wrote: >>>> From: Maor Gottlieb <maorg@nvidia.com> >>>> >>>> Usage will be in next patches. >>>> >>>> Signed-off-by: Maor Gottlieb <maorg@nvidia.com> >>>> Signed-off-by: Yishai Hadas <yishaih@nvidia.com> >>>> --- >>>> providers/mlx5/mlx5.c | 28 ++++++++++++++-------------- >>>> providers/mlx5/mlx5.h | 4 ++++ >>>> 2 files changed, 18 insertions(+), 14 deletions(-) >>> Probably, this patch will be needed to be changed after >>> "Verbs logging API" PR https://github.com/linux-rdma/rdma-core/pull/1030 >>> >>> Thanks >> Well, not really, this patch just reorganizes things inside mlx5 for code >> sharing. > After Gal's PR, I expect to see all mlx5 file/debug logic gone except > some minimal conversion logic to support old legacy variables. > > All that get_env_... code will go. > > Thanks > Looking on current VERBs logging PR, it doesn't give a clean path conversion from mlx5. For example it missed a debug granularity per object (e.g. QP, CQ, etc.) , in addition it doesn't define a driver specific options as we have today in mlx5 (e.g. MLX5_DBG_DR). I believe that this should be added before going forward with the logging PR to enable a clean transition later on. The transition of mlx5 should preserve current API/semantics (ENV, etc.) and is an orthogonal task. Yishai
On 20/07/2021 17:57, Yishai Hadas wrote: > On 7/20/2021 3:27 PM, Leon Romanovsky wrote: >> On Tue, Jul 20, 2021 at 12:27:46PM +0300, Yishai Hadas wrote: >>> On 7/20/2021 11:51 AM, Leon Romanovsky wrote: >>>> On Tue, Jul 20, 2021 at 11:16:23AM +0300, Yishai Hadas wrote: >>>>> From: Maor Gottlieb <maorg@nvidia.com> >>>>> >>>>> Usage will be in next patches. >>>>> >>>>> Signed-off-by: Maor Gottlieb <maorg@nvidia.com> >>>>> Signed-off-by: Yishai Hadas <yishaih@nvidia.com> >>>>> --- >>>>> providers/mlx5/mlx5.c | 28 ++++++++++++++-------------- >>>>> providers/mlx5/mlx5.h | 4 ++++ >>>>> 2 files changed, 18 insertions(+), 14 deletions(-) >>>> Probably, this patch will be needed to be changed after >>>> "Verbs logging API" PR https://github.com/linux-rdma/rdma-core/pull/1030 >>>> >>>> Thanks >>> Well, not really, this patch just reorganizes things inside mlx5 for code >>> sharing. >> After Gal's PR, I expect to see all mlx5 file/debug logic gone except >> some minimal conversion logic to support old legacy variables. >> >> All that get_env_... code will go. >> >> Thanks >> > Looking on current VERBs logging PR, it doesn't give a clean path conversion > from mlx5. > > For example it missed a debug granularity per object (e.g. QP, CQ, etc.) , in > addition it doesn't define a driver specific options as we have today in mlx5 > (e.g. MLX5_DBG_DR). > > I believe that this should be added before going forward with the logging PR to > enable a clean transition later on. > > The transition of mlx5 should preserve current API/semantics (ENV, etc.) and is > an orthogonal task. Yishai, if you have any more concerns please share them in the PR.. The discussion there is going on for a while and you've been quiet so I assumed you're fine with it. I disagree about needing to support everything that exists in mlx5 today, the purpose of the generic mechanism is to unify the environment variables, driver specific options do the opposite. IMO masking out a few prints isn't worth the divergence. If the mlx5 provider has backwards compatibility issues it doesn't necessarily have to use this API, but we can at least covert most existing providers and all future providers that wish to support such functionality in a unified way. BTW, why even insist on having backwards compatibility here? Do you actually have useful code that relies on debug environment variables?
On 7/21/2021 10:05 AM, Gal Pressman wrote: > On 20/07/2021 17:57, Yishai Hadas wrote: >> On 7/20/2021 3:27 PM, Leon Romanovsky wrote: >>> On Tue, Jul 20, 2021 at 12:27:46PM +0300, Yishai Hadas wrote: >>>> On 7/20/2021 11:51 AM, Leon Romanovsky wrote: >>>>> On Tue, Jul 20, 2021 at 11:16:23AM +0300, Yishai Hadas wrote: >>>>>> From: Maor Gottlieb <maorg@nvidia.com> >>>>>> >>>>>> Usage will be in next patches. >>>>>> >>>>>> Signed-off-by: Maor Gottlieb <maorg@nvidia.com> >>>>>> Signed-off-by: Yishai Hadas <yishaih@nvidia.com> >>>>>> --- >>>>>> providers/mlx5/mlx5.c | 28 ++++++++++++++-------------- >>>>>> providers/mlx5/mlx5.h | 4 ++++ >>>>>> 2 files changed, 18 insertions(+), 14 deletions(-) >>>>> Probably, this patch will be needed to be changed after >>>>> "Verbs logging API" PR https://github.com/linux-rdma/rdma-core/pull/1030 >>>>> >>>>> Thanks >>>> Well, not really, this patch just reorganizes things inside mlx5 for code >>>> sharing. >>> After Gal's PR, I expect to see all mlx5 file/debug logic gone except >>> some minimal conversion logic to support old legacy variables. >>> >>> All that get_env_... code will go. >>> >>> Thanks >>> >> Looking on current VERBs logging PR, it doesn't give a clean path conversion >> from mlx5. >> >> For example it missed a debug granularity per object (e.g. QP, CQ, etc.) , in >> addition it doesn't define a driver specific options as we have today in mlx5 >> (e.g. MLX5_DBG_DR). >> >> I believe that this should be added before going forward with the logging PR to >> enable a clean transition later on. >> >> The transition of mlx5 should preserve current API/semantics (ENV, etc.) and is >> an orthogonal task. > Yishai, if you have any more concerns please share them in the PR.. The > discussion there is going on for a while and you've been quiet so I assumed > you're fine with it. > > I disagree about needing to support everything that exists in mlx5 today, the > purpose of the generic mechanism is to unify the environment variables, driver > specific options do the opposite. IMO masking out a few prints isn't worth the > divergence. The options in mlx5 gives more granularity and supports vendor specific options, this may be needed down the road by other vendors as well. If we come with a new API need to consider such options in the general case. NP, I'll comment in the logging PR as well. > > If the mlx5 provider has backwards compatibility issues it doesn't necessarily > have to use this API, but we can at least covert most existing providers and all > future providers that wish to support such functionality in a unified way. The point was that current suggestion doesn't allow a clean transition for mlx5, so we won't use it unless the API will give a matching alternative. > BTW, why even insist on having backwards compatibility here? Do you actually > have useful code that relies on debug environment variables? Logging options (env, mask, etc.) are kind of API which need to be preserved, it's used in the field for debug purposes, no reason to loose granularity and trace. Yishai
On 21/07/2021 10:58, Yishai Hadas wrote: > On 7/21/2021 10:05 AM, Gal Pressman wrote: >> On 20/07/2021 17:57, Yishai Hadas wrote: >>> On 7/20/2021 3:27 PM, Leon Romanovsky wrote: >>>> On Tue, Jul 20, 2021 at 12:27:46PM +0300, Yishai Hadas wrote: >>>>> On 7/20/2021 11:51 AM, Leon Romanovsky wrote: >>>>>> On Tue, Jul 20, 2021 at 11:16:23AM +0300, Yishai Hadas wrote: >>>>>>> From: Maor Gottlieb <maorg@nvidia.com> >>>>>>> >>>>>>> Usage will be in next patches. >>>>>>> >>>>>>> Signed-off-by: Maor Gottlieb <maorg@nvidia.com> >>>>>>> Signed-off-by: Yishai Hadas <yishaih@nvidia.com> >>>>>>> --- >>>>>>> providers/mlx5/mlx5.c | 28 ++++++++++++++-------------- >>>>>>> providers/mlx5/mlx5.h | 4 ++++ >>>>>>> 2 files changed, 18 insertions(+), 14 deletions(-) >>>>>> Probably, this patch will be needed to be changed after >>>>>> "Verbs logging API" PR https://github.com/linux-rdma/rdma-core/pull/1030 >>>>>> >>>>>> Thanks >>>>> Well, not really, this patch just reorganizes things inside mlx5 for code >>>>> sharing. >>>> After Gal's PR, I expect to see all mlx5 file/debug logic gone except >>>> some minimal conversion logic to support old legacy variables. >>>> >>>> All that get_env_... code will go. >>>> >>>> Thanks >>>> >>> Looking on current VERBs logging PR, it doesn't give a clean path conversion >>> from mlx5. >>> >>> For example it missed a debug granularity per object (e.g. QP, CQ, etc.) , in >>> addition it doesn't define a driver specific options as we have today in mlx5 >>> (e.g. MLX5_DBG_DR). >>> >>> I believe that this should be added before going forward with the logging PR to >>> enable a clean transition later on. >>> >>> The transition of mlx5 should preserve current API/semantics (ENV, etc.) and is >>> an orthogonal task. >> Yishai, if you have any more concerns please share them in the PR.. The >> discussion there is going on for a while and you've been quiet so I assumed >> you're fine with it. >> >> I disagree about needing to support everything that exists in mlx5 today, the >> purpose of the generic mechanism is to unify the environment variables, driver >> specific options do the opposite. IMO masking out a few prints isn't worth the >> divergence. > > > The options in mlx5 gives more granularity and supports vendor specific options, > this may be needed down the road by other vendors as well. > > If we come with a new API need to consider such options in the general case. > > NP, I'll comment in the logging PR as well. > >> >> If the mlx5 provider has backwards compatibility issues it doesn't necessarily >> have to use this API, but we can at least covert most existing providers and all >> future providers that wish to support such functionality in a unified way. > > > The point was that current suggestion doesn't allow a clean transition for mlx5, > so we won't use it unless the API will give a matching alternative. > >> BTW, why even insist on having backwards compatibility here? Do you actually >> have useful code that relies on debug environment variables? > > Logging options (env, mask, etc.) are kind of API which need to be preserved, > it's used in the field for debug purposes, no reason to loose granularity and > trace. It's used in the field by *people*, which unlike legacy code can learn to use the new debug env variables, they don't need backwards compatibility (the same as users of debugfs). I addressed the granularity issue in the PR.
diff --git a/providers/mlx5/mlx5.c b/providers/mlx5/mlx5.c index 46d7748..1abaa8c 100644 --- a/providers/mlx5/mlx5.c +++ b/providers/mlx5/mlx5.c @@ -583,7 +583,7 @@ static int get_total_uuars(int page_size) return size; } -static void open_debug_file(struct mlx5_context *ctx) +void mlx5_open_debug_file(FILE **dbg_fp) { char *env; FILE *default_dbg_fp = NULL; @@ -594,25 +594,25 @@ static void open_debug_file(struct mlx5_context *ctx) env = getenv("MLX5_DEBUG_FILE"); if (!env) { - ctx->dbg_fp = default_dbg_fp; + *dbg_fp = default_dbg_fp; return; } - ctx->dbg_fp = fopen(env, "aw+"); - if (!ctx->dbg_fp) { - ctx->dbg_fp = default_dbg_fp; - mlx5_err(ctx->dbg_fp, "Failed opening debug file %s\n", env); + *dbg_fp = fopen(env, "aw+"); + if (!*dbg_fp) { + *dbg_fp = default_dbg_fp; + mlx5_err(*dbg_fp, "Failed opening debug file %s\n", env); return; } } -static void close_debug_file(struct mlx5_context *ctx) +void mlx5_close_debug_file(FILE *dbg_fp) { - if (ctx->dbg_fp && ctx->dbg_fp != stderr) - fclose(ctx->dbg_fp); + if (dbg_fp && dbg_fp != stderr) + fclose(dbg_fp); } -static void set_debug_mask(void) +void mlx5_set_debug_mask(void) { char *env; @@ -2036,7 +2036,7 @@ static int get_uar_info(struct mlx5_device *mdev, static void mlx5_uninit_context(struct mlx5_context *context) { - close_debug_file(context); + mlx5_close_debug_file(context->dbg_fp); verbs_uninit_context(&context->ibv_ctx); free(context); @@ -2056,8 +2056,8 @@ static struct mlx5_context *mlx5_init_context(struct ibv_device *ibdev, if (!context) return NULL; - open_debug_file(context); - set_debug_mask(); + mlx5_open_debug_file(&context->dbg_fp); + mlx5_set_debug_mask(); set_freeze_on_error(); if (gethostname(context->hostname, sizeof(context->hostname))) strcpy(context->hostname, "host_unknown"); @@ -2377,7 +2377,7 @@ static void mlx5_free_context(struct ibv_context *ibctx) page_size); if (context->clock_info_page) munmap((void *)context->clock_info_page, page_size); - close_debug_file(context); + mlx5_close_debug_file(context->dbg_fp); clean_dyn_uars(ibctx); reserved_qpn_blks_free(context); diff --git a/providers/mlx5/mlx5.h b/providers/mlx5/mlx5.h index 3862007..7436bc8 100644 --- a/providers/mlx5/mlx5.h +++ b/providers/mlx5/mlx5.h @@ -992,6 +992,10 @@ static inline struct mlx5_flow *to_mflow(struct ibv_flow *flow_id) bool is_mlx5_dev(struct ibv_device *device); +void mlx5_open_debug_file(FILE **dbg_fp); +void mlx5_close_debug_file(FILE *dbg_fp); +void mlx5_set_debug_mask(void); + int mlx5_alloc_buf(struct mlx5_buf *buf, size_t size, int page_size); void mlx5_free_buf(struct mlx5_buf *buf); int mlx5_alloc_buf_contig(struct mlx5_context *mctx, struct mlx5_buf *buf,