Message ID | 20231225032117.7493-3-chengyou@linux.alibaba.com (mailing list archive) |
---|---|
State | Superseded |
Series | RDMA/erdma: Introduce hardware statistics support |
Hi Cheng,

kernel test robot noticed the following build warnings:

[auto build test WARNING on rdma/for-next]
[also build test WARNING on linus/master v6.7-rc7 next-20231222]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Cheng-Xu/RDMA-erdma-Introduce-dma-pool-for-hardware-responses-of-CMDQ-requests/20231225-154653
base:   https://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma.git for-next
patch link:    https://lore.kernel.org/r/20231225032117.7493-3-chengyou%40linux.alibaba.com
patch subject: [PATCH for-next v2 2/2] RDMA/erdma: Add hardware statistics support
config: s390-allyesconfig (https://download.01.org/0day-ci/archive/20231226/202312260550.9DPkrw52-lkp@intel.com/config)
compiler: s390-linux-gcc (GCC) 13.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20231226/202312260550.9DPkrw52-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202312260550.9DPkrw52-lkp@intel.com/

All warnings (new ones prefixed by >>):

>> drivers/infiniband/hw/erdma/erdma_verbs.c:1750:5: warning: no previous prototype for 'erdma_query_hw_stats' [-Wmissing-prototypes]
    1750 | int erdma_query_hw_stats(struct erdma_dev *dev, struct rdma_hw_stats *stats)
         |     ^~~~~~~~~~~~~~~~~~~~

vim +/erdma_query_hw_stats +1750 drivers/infiniband/hw/erdma/erdma_verbs.c

  1749
> 1750	int erdma_query_hw_stats(struct erdma_dev *dev, struct rdma_hw_stats *stats)
  1751	{
  1752		struct erdma_cmdq_query_stats_resp *resp;
  1753		struct erdma_cmdq_query_req req;
  1754		dma_addr_t dma_addr;
  1755		int err;
  1756
  1757		erdma_cmdq_build_reqhdr(&req.hdr, CMDQ_SUBMOD_COMMON,
  1758					CMDQ_OPCODE_GET_STATS);
  1759
  1760		resp = dma_pool_zalloc(dev->resp_pool, GFP_KERNEL, &dma_addr);
  1761		if (!resp)
  1762			return -ENOMEM;
  1763
  1764		req.target_addr = dma_addr;
  1765		req.target_length = ERDMA_HW_RESP_SIZE;
  1766
  1767		err = erdma_post_cmd_wait(&dev->cmdq, &req, sizeof(req), NULL, NULL);
  1768		if (err)
  1769			goto out;
  1770
  1771		if (resp->hdr.magic != ERDMA_HW_RESP_MAGIC) {
  1772			err = -EINVAL;
  1773			goto out;
  1774		}
  1775
  1776		memcpy(&stats->value[0], &resp->tx_req_cnt,
  1777		       sizeof(u64) * stats->num_counters);
  1778
  1779	out:
  1780		dma_pool_free(dev->resp_pool, resp, dma_addr);
  1781
  1782		return err;
  1783	}
  1784
Hi Cheng,

kernel test robot noticed the following build warnings:

[auto build test WARNING on rdma/for-next]
[also build test WARNING on linus/master v6.7-rc7 next-20231222]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Cheng-Xu/RDMA-erdma-Introduce-dma-pool-for-hardware-responses-of-CMDQ-requests/20231225-154653
base:   https://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma.git for-next
patch link:    https://lore.kernel.org/r/20231225032117.7493-3-chengyou%40linux.alibaba.com
patch subject: [PATCH for-next v2 2/2] RDMA/erdma: Add hardware statistics support
config: x86_64-allyesconfig (https://download.01.org/0day-ci/archive/20231226/202312260724.2RjYLbxV-lkp@intel.com/config)
compiler: clang version 16.0.4 (https://github.com/llvm/llvm-project.git ae42196bc493ffe877a7e3dff8be32035dea4d07)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20231226/202312260724.2RjYLbxV-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202312260724.2RjYLbxV-lkp@intel.com/

All warnings (new ones prefixed by >>):

>> drivers/infiniband/hw/erdma/erdma_verbs.c:1750:5: warning: no previous prototype for function 'erdma_query_hw_stats' [-Wmissing-prototypes]
   int erdma_query_hw_stats(struct erdma_dev *dev, struct rdma_hw_stats *stats)
       ^
   drivers/infiniband/hw/erdma/erdma_verbs.c:1750:1: note: declare 'static' if the function is not intended to be used outside of this translation unit
   int erdma_query_hw_stats(struct erdma_dev *dev, struct rdma_hw_stats *stats)
   ^
   static
   1 warning generated.

vim +/erdma_query_hw_stats +1750 drivers/infiniband/hw/erdma/erdma_verbs.c

  1749
> 1750	int erdma_query_hw_stats(struct erdma_dev *dev, struct rdma_hw_stats *stats)
  1751	{
  1752		struct erdma_cmdq_query_stats_resp *resp;
  1753		struct erdma_cmdq_query_req req;
  1754		dma_addr_t dma_addr;
  1755		int err;
  1756
  1757		erdma_cmdq_build_reqhdr(&req.hdr, CMDQ_SUBMOD_COMMON,
  1758					CMDQ_OPCODE_GET_STATS);
  1759
  1760		resp = dma_pool_zalloc(dev->resp_pool, GFP_KERNEL, &dma_addr);
  1761		if (!resp)
  1762			return -ENOMEM;
  1763
  1764		req.target_addr = dma_addr;
  1765		req.target_length = ERDMA_HW_RESP_SIZE;
  1766
  1767		err = erdma_post_cmd_wait(&dev->cmdq, &req, sizeof(req), NULL, NULL);
  1768		if (err)
  1769			goto out;
  1770
  1771		if (resp->hdr.magic != ERDMA_HW_RESP_MAGIC) {
  1772			err = -EINVAL;
  1773			goto out;
  1774		}
  1775
  1776		memcpy(&stats->value[0], &resp->tx_req_cnt,
  1777		       sizeof(u64) * stats->num_counters);
  1778
  1779	out:
  1780		dma_pool_free(dev->resp_pool, resp, dma_addr);
  1781
  1782		return err;
  1783	}
  1784
On 2023/12/26 6:09, kernel test robot wrote:
> Hi Cheng,
>
> kernel test robot noticed the following build warnings:
>
> [auto build test WARNING on rdma/for-next]
> [also build test WARNING on linus/master v6.7-rc7 next-20231222]
> [If your patch is applied to the wrong git tree, kindly drop us a note.
> And when submitting patch, we suggest to use '--base' as documented in
> https://git-scm.com/docs/git-format-patch#_base_tree_information]
>
> url:    https://github.com/intel-lab-lkp/linux/commits/Cheng-Xu/RDMA-erdma-Introduce-dma-pool-for-hardware-responses-of-CMDQ-requests/20231225-154653
> base:   https://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma.git for-next
> patch link:    https://lore.kernel.org/r/20231225032117.7493-3-chengyou%40linux.alibaba.com
> patch subject: [PATCH for-next v2 2/2] RDMA/erdma: Add hardware statistics support
> config: s390-allyesconfig (https://download.01.org/0day-ci/archive/20231226/202312260550.9DPkrw52-lkp@intel.com/config)
> compiler: s390-linux-gcc (GCC) 13.2.0
> reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20231226/202312260550.9DPkrw52-lkp@intel.com/reproduce)
>
> If you fix the issue in a separate patch/commit (i.e. not just a new version of
> the same patch/commit), kindly add following tags
> | Reported-by: kernel test robot <lkp@intel.com>
> | Closes: https://lore.kernel.org/oe-kbuild-all/202312260550.9DPkrw52-lkp@intel.com/
>
> All warnings (new ones prefixed by >>):
>
>>> drivers/infiniband/hw/erdma/erdma_verbs.c:1750:5: warning: no previous prototype for 'erdma_query_hw_stats' [-Wmissing-prototypes]
>     1750 | int erdma_query_hw_stats(struct erdma_dev *dev, struct rdma_hw_stats *stats)
>          |     ^~~~~~~~~~~~~~~~~~~~
>

Prepending "static" can fix this problem.

diff --git a/drivers/infiniband/hw/erdma/erdma_verbs.c b/drivers/infiniband/hw/erdma/erdma_verbs.c
index e47e158bedd5..de534651658d 100644
--- a/drivers/infiniband/hw/erdma/erdma_verbs.c
+++ b/drivers/infiniband/hw/erdma/erdma_verbs.c
@@ -1747,7 +1747,7 @@ struct rdma_hw_stats *erdma_alloc_hw_port_stats(struct ib_device *device,
 					  RDMA_HW_STATS_DEFAULT_LIFESPAN);
 }
 
-int erdma_query_hw_stats(struct erdma_dev *dev, struct rdma_hw_stats *stats)
+static int erdma_query_hw_stats(struct erdma_dev *dev, struct rdma_hw_stats *stats)
 {
 	struct erdma_cmdq_query_stats_resp *resp;
 	struct erdma_cmdq_query_req req;

>
> vim +/erdma_query_hw_stats +1750 drivers/infiniband/hw/erdma/erdma_verbs.c
>
>   1749
>> 1750	int erdma_query_hw_stats(struct erdma_dev *dev, struct rdma_hw_stats *stats)
>   1751	{
>   1752		struct erdma_cmdq_query_stats_resp *resp;
>   1753		struct erdma_cmdq_query_req req;
>   1754		dma_addr_t dma_addr;
>   1755		int err;
>   1756
>   1757		erdma_cmdq_build_reqhdr(&req.hdr, CMDQ_SUBMOD_COMMON,
>   1758					CMDQ_OPCODE_GET_STATS);
>   1759
>   1760		resp = dma_pool_zalloc(dev->resp_pool, GFP_KERNEL, &dma_addr);
>   1761		if (!resp)
>   1762			return -ENOMEM;
>   1763
>   1764		req.target_addr = dma_addr;
>   1765		req.target_length = ERDMA_HW_RESP_SIZE;
>   1766
>   1767		err = erdma_post_cmd_wait(&dev->cmdq, &req, sizeof(req), NULL, NULL);
>   1768		if (err)
>   1769			goto out;
>   1770
>   1771		if (resp->hdr.magic != ERDMA_HW_RESP_MAGIC) {
>   1772			err = -EINVAL;
>   1773			goto out;
>   1774		}
>   1775
>   1776		memcpy(&stats->value[0], &resp->tx_req_cnt,
>   1777		       sizeof(u64) * stats->num_counters);
>   1778
>   1779	out:
>   1780		dma_pool_free(dev->resp_pool, resp, dma_addr);
>   1781
>   1782		return err;
>   1783	}
>   1784
>
On 12/26/23 10:35 AM, Zhu Yanjun wrote:
> On 2023/12/26 6:09, kernel test robot wrote:
>> Hi Cheng,
>>
>> kernel test robot noticed the following build warnings:
>>
>> [auto build test WARNING on rdma/for-next]
>> [also build test WARNING on linus/master v6.7-rc7 next-20231222]
>> [If your patch is applied to the wrong git tree, kindly drop us a note.
>> And when submitting patch, we suggest to use '--base' as documented in
>> https://git-scm.com/docs/git-format-patch#_base_tree_information]
>>
>> url:    https://github.com/intel-lab-lkp/linux/commits/Cheng-Xu/RDMA-erdma-Introduce-dma-pool-for-hardware-responses-of-CMDQ-requests/20231225-154653
>> base:   https://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma.git for-next
>> patch link:    https://lore.kernel.org/r/20231225032117.7493-3-chengyou%40linux.alibaba.com
>> patch subject: [PATCH for-next v2 2/2] RDMA/erdma: Add hardware statistics support
>> config: s390-allyesconfig (https://download.01.org/0day-ci/archive/20231226/202312260550.9DPkrw52-lkp@intel.com/config)
>> compiler: s390-linux-gcc (GCC) 13.2.0
>> reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20231226/202312260550.9DPkrw52-lkp@intel.com/reproduce)
>>
>> If you fix the issue in a separate patch/commit (i.e. not just a new version of
>> the same patch/commit), kindly add following tags
>> | Reported-by: kernel test robot <lkp@intel.com>
>> | Closes: https://lore.kernel.org/oe-kbuild-all/202312260550.9DPkrw52-lkp@intel.com/
>>
>> All warnings (new ones prefixed by >>):
>>
>>>> drivers/infiniband/hw/erdma/erdma_verbs.c:1750:5: warning: no previous prototype for 'erdma_query_hw_stats' [-Wmissing-prototypes]
>>     1750 | int erdma_query_hw_stats(struct erdma_dev *dev, struct rdma_hw_stats *stats)
>>          |     ^~~~~~~~~~~~~~~~~~~~
>>
>
> Prepending "static" can fix this problem.
>

You are right, thanks for your suggestion.

Cheng Xu
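
For readers following the thread, here is a minimal stand-alone sketch of what the W=1 warning is about and why "static" resolves it. It is not erdma code: the file and the names `query_counters` and `get_counters` are hypothetical and only mirror the split between a file-local helper and an exported entry point.

```c
/* stats.c: a hypothetical translation unit, not part of the erdma driver. */
#include <stdint.h>

/*
 * Helper used only in this file. "static" gives it internal linkage, so
 * -Wmissing-prototypes does not apply to it. Without "static", a W=1
 * build expects a declaration to be visible before this definition.
 */
static int query_counters(uint64_t *values, int n)
{
	for (int i = 0; i < n; i++)
		values[i] = (uint64_t)i * 10;	/* placeholder data */
	return 0;
}

/*
 * Externally visible entry point: it needs a prototype ahead of the
 * definition. In a real driver this declaration would live in a header.
 */
int get_counters(uint64_t *values, int n);

int get_counters(uint64_t *values, int n)
{
	if (n <= 0)
		return 0;
	if (query_counters(values, n))
		return -1;
	return n;	/* number of counters filled in */
}
```

Compiling a file like this with `-Wmissing-prototypes` should be quiet; dropping the `static` keyword from the helper is what would bring the warning back.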
diff --git a/drivers/infiniband/hw/erdma/erdma_hw.h b/drivers/infiniband/hw/erdma/erdma_hw.h
index 4baabf1f2b08..ed8e3940cf4c 100644
--- a/drivers/infiniband/hw/erdma/erdma_hw.h
+++ b/drivers/infiniband/hw/erdma/erdma_hw.h
@@ -146,6 +146,7 @@ enum CMDQ_COMMON_OPCODE {
 	CMDQ_OPCODE_DESTROY_EQ = 1,
 	CMDQ_OPCODE_QUERY_FW_INFO = 2,
 	CMDQ_OPCODE_CONF_MTU = 3,
+	CMDQ_OPCODE_GET_STATS = 4,
 	CMDQ_OPCODE_CONF_DEVICE = 5,
 	CMDQ_OPCODE_ALLOC_DB = 8,
 	CMDQ_OPCODE_FREE_DB = 9,
@@ -359,6 +360,42 @@ struct erdma_cmdq_reflush_req {
 
 #define ERDMA_HW_RESP_SIZE 256
 
+struct erdma_cmdq_query_req {
+	u64 hdr;
+	u32 rsvd;
+	u32 index;
+
+	u64 target_addr;
+	u32 target_length;
+};
+
+#define ERDMA_HW_RESP_MAGIC 0x5566
+
+struct erdma_cmdq_query_resp_hdr {
+	u16 magic;
+	u8 ver;
+	u8 length;
+
+	u32 index;
+	u32 rsvd[2];
+};
+
+struct erdma_cmdq_query_stats_resp {
+	struct erdma_cmdq_query_resp_hdr hdr;
+
+	u64 tx_req_cnt;
+	u64 tx_packets_cnt;
+	u64 tx_bytes_cnt;
+	u64 tx_drop_packets_cnt;
+	u64 tx_bps_meter_drop_packets_cnt;
+	u64 tx_pps_meter_drop_packets_cnt;
+	u64 rx_packets_cnt;
+	u64 rx_bytes_cnt;
+	u64 rx_drop_packets_cnt;
+	u64 rx_bps_meter_drop_packets_cnt;
+	u64 rx_pps_meter_drop_packets_cnt;
+};
+
 /* cap qword 0 definition */
 #define ERDMA_CMD_DEV_CAP_MAX_CQE_MASK GENMASK_ULL(47, 40)
 #define ERDMA_CMD_DEV_CAP_FLAGS_MASK GENMASK_ULL(31, 24)
diff --git a/drivers/infiniband/hw/erdma/erdma_main.c b/drivers/infiniband/hw/erdma/erdma_main.c
index e4df5bf89cd0..472939172f0c 100644
--- a/drivers/infiniband/hw/erdma/erdma_main.c
+++ b/drivers/infiniband/hw/erdma/erdma_main.c
@@ -468,6 +468,7 @@ static const struct ib_device_ops erdma_device_ops = {
 	.driver_id = RDMA_DRIVER_ERDMA,
 	.uverbs_abi_ver = ERDMA_ABI_VERSION,
 
+	.alloc_hw_port_stats = erdma_alloc_hw_port_stats,
 	.alloc_mr = erdma_ib_alloc_mr,
 	.alloc_pd = erdma_alloc_pd,
 	.alloc_ucontext = erdma_alloc_ucontext,
@@ -479,6 +480,7 @@ static const struct ib_device_ops erdma_device_ops = {
 	.destroy_cq = erdma_destroy_cq,
 	.destroy_qp = erdma_destroy_qp,
 	.get_dma_mr = erdma_get_dma_mr,
+	.get_hw_stats = erdma_get_hw_stats,
 	.get_port_immutable = erdma_get_port_immutable,
 	.iw_accept = erdma_accept,
 	.iw_add_ref = erdma_qp_get_ref,
diff --git a/drivers/infiniband/hw/erdma/erdma_verbs.c b/drivers/infiniband/hw/erdma/erdma_verbs.c
index c317947563fb..e47e158bedd5 100644
--- a/drivers/infiniband/hw/erdma/erdma_verbs.c
+++ b/drivers/infiniband/hw/erdma/erdma_verbs.c
@@ -1708,3 +1708,92 @@ void erdma_port_event(struct erdma_dev *dev, enum ib_event_type reason)
 
 	ib_dispatch_event(&event);
 }
+
+enum counters {
+	ERDMA_STATS_TX_REQS_CNT,
+	ERDMA_STATS_TX_PACKETS_CNT,
+	ERDMA_STATS_TX_BYTES_CNT,
+	ERDMA_STATS_TX_DISABLE_DROP_CNT,
+	ERDMA_STATS_TX_BPS_METER_DROP_CNT,
+	ERDMA_STATS_TX_PPS_METER_DROP_CNT,
+
+	ERDMA_STATS_RX_PACKETS_CNT,
+	ERDMA_STATS_RX_BYTES_CNT,
+	ERDMA_STATS_RX_DISABLE_DROP_CNT,
+	ERDMA_STATS_RX_BPS_METER_DROP_CNT,
+	ERDMA_STATS_RX_PPS_METER_DROP_CNT,
+
+	ERDMA_STATS_MAX
+};
+
+static const struct rdma_stat_desc erdma_descs[] = {
+	[ERDMA_STATS_TX_REQS_CNT].name = "tx_reqs_cnt",
+	[ERDMA_STATS_TX_PACKETS_CNT].name = "tx_packets_cnt",
+	[ERDMA_STATS_TX_BYTES_CNT].name = "tx_bytes_cnt",
+	[ERDMA_STATS_TX_DISABLE_DROP_CNT].name = "tx_disable_drop_cnt",
+	[ERDMA_STATS_TX_BPS_METER_DROP_CNT].name = "tx_bps_limit_drop_cnt",
+	[ERDMA_STATS_TX_PPS_METER_DROP_CNT].name = "tx_pps_limit_drop_cnt",
+	[ERDMA_STATS_RX_PACKETS_CNT].name = "rx_packets_cnt",
+	[ERDMA_STATS_RX_BYTES_CNT].name = "rx_bytes_cnt",
+	[ERDMA_STATS_RX_DISABLE_DROP_CNT].name = "rx_disable_drop_cnt",
+	[ERDMA_STATS_RX_BPS_METER_DROP_CNT].name = "rx_bps_limit_drop_cnt",
+	[ERDMA_STATS_RX_PPS_METER_DROP_CNT].name = "rx_pps_limit_drop_cnt",
+};
+
+struct rdma_hw_stats *erdma_alloc_hw_port_stats(struct ib_device *device,
+						u32 port_num)
+{
+	return rdma_alloc_hw_stats_struct(erdma_descs, ERDMA_STATS_MAX,
+					  RDMA_HW_STATS_DEFAULT_LIFESPAN);
+}
+
+int erdma_query_hw_stats(struct erdma_dev *dev, struct rdma_hw_stats *stats)
+{
+	struct erdma_cmdq_query_stats_resp *resp;
+	struct erdma_cmdq_query_req req;
+	dma_addr_t dma_addr;
+	int err;
+
+	erdma_cmdq_build_reqhdr(&req.hdr, CMDQ_SUBMOD_COMMON,
+				CMDQ_OPCODE_GET_STATS);
+
+	resp = dma_pool_zalloc(dev->resp_pool, GFP_KERNEL, &dma_addr);
+	if (!resp)
+		return -ENOMEM;
+
+	req.target_addr = dma_addr;
+	req.target_length = ERDMA_HW_RESP_SIZE;
+
+	err = erdma_post_cmd_wait(&dev->cmdq, &req, sizeof(req), NULL, NULL);
+	if (err)
+		goto out;
+
+	if (resp->hdr.magic != ERDMA_HW_RESP_MAGIC) {
+		err = -EINVAL;
+		goto out;
+	}
+
+	memcpy(&stats->value[0], &resp->tx_req_cnt,
+	       sizeof(u64) * stats->num_counters);
+
+out:
+	dma_pool_free(dev->resp_pool, resp, dma_addr);
+
+	return err;
+}
+
+int erdma_get_hw_stats(struct ib_device *ibdev, struct rdma_hw_stats *stats,
+		       u32 port, int index)
+{
+	struct erdma_dev *dev = to_edev(ibdev);
+	int ret;
+
+	if (port == 0)
+		return 0;
+
+	ret = erdma_query_hw_stats(dev, stats);
+	if (ret)
+		return ret;
+
+	return stats->num_counters;
+}
diff --git a/drivers/infiniband/hw/erdma/erdma_verbs.h b/drivers/infiniband/hw/erdma/erdma_verbs.h
index eb9c0f92fb6f..db6018529ccc 100644
--- a/drivers/infiniband/hw/erdma/erdma_verbs.h
+++ b/drivers/infiniband/hw/erdma/erdma_verbs.h
@@ -361,5 +361,9 @@ int erdma_map_mr_sg(struct ib_mr *ibmr, struct scatterlist *sg, int sg_nents,
 		    unsigned int *sg_offset);
 void erdma_port_event(struct erdma_dev *dev, enum ib_event_type reason);
 void erdma_set_mtu(struct erdma_dev *dev, u32 mtu);
+struct rdma_hw_stats *erdma_alloc_hw_port_stats(struct ib_device *device,
+						u32 port_num);
+int erdma_get_hw_stats(struct ib_device *ibdev, struct rdma_hw_stats *stats,
+		       u32 port, int index);
 
 #endif
First, we add a new command to query hardware statistics, and then implement
two functions, ib_device_ops.alloc_hw_port_stats and
ib_device_ops.get_hw_stats, so that the rdma tool can retrieve the statistics
of an erdma device.

Signed-off-by: Cheng Xu <chengyou@linux.alibaba.com>
---
 drivers/infiniband/hw/erdma/erdma_hw.h    | 37 ++++++++++
 drivers/infiniband/hw/erdma/erdma_main.c  |  2 +
 drivers/infiniband/hw/erdma/erdma_verbs.c | 89 +++++++++++++++++++++++
 drivers/infiniband/hw/erdma/erdma_verbs.h |  4 +
 4 files changed, 132 insertions(+)
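
The response handling described above (validate the header magic, then copy a run of consecutive 64-bit counters into the stats array) can be sketched in user space as follows. This is an illustration under stated assumptions, not the driver's code: the struct and function names are hypothetical, the counters are modelled as an array rather than eleven named fields, and the bound check on `num` is an extra guard that the patch itself does not contain.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define RESP_MAGIC   0x5566	/* same value as ERDMA_HW_RESP_MAGIC in the patch */
#define NUM_COUNTERS 11		/* the patch defines 11 counters */

/* Hypothetical mirror of the 16-byte response header from the patch. */
struct resp_hdr {
	uint16_t magic;
	uint8_t ver;
	uint8_t length;
	uint32_t index;
	uint32_t rsvd[2];
};

/* Hypothetical response layout: header followed by the counters. */
struct stats_resp {
	struct resp_hdr hdr;
	uint64_t counters[NUM_COUNTERS];
};

/*
 * Copy up to NUM_COUNTERS values out of a response buffer.
 * Returns the number of counters copied, or -EINVAL if the header
 * magic is wrong or the caller asks for more than the response holds.
 */
static int copy_counters(uint64_t *dst, size_t num,
			 const struct stats_resp *resp)
{
	if (resp->hdr.magic != RESP_MAGIC)
		return -EINVAL;
	if (num > NUM_COUNTERS)		/* assumption: not in the patch */
		return -EINVAL;

	memcpy(dst, resp->counters, sizeof(uint64_t) * num);
	return (int)num;
}
```

Returning the counter count on success follows the convention used by erdma_get_hw_stats() in the patch, which returns stats->num_counters.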