Message ID | 20240111120036.109903-10-guwen@linux.alibaba.com (mailing list archive) |
---|---|
State | Deferred |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | net/smc: implement loopback-ism used by SMC-D | expand |
On 11.01.24 13:00, Wen Gu wrote: > This introduces some statistics attributes of loopback-ism. They can be > read from /sys/devices/virtual/smc/loopback-ism/{xfer_tytes|dmbs_cnt}. > > Signed-off-by: Wen Gu <guwen@linux.alibaba.com> > --- > net/smc/smc_loopback.c | 74 ++++++++++++++++++++++++++++++++++++++++++ > net/smc/smc_loopback.h | 22 +++++++++++++ > 2 files changed, 96 insertions(+) > I've read the comments from Jiri and your answer. I can understand your thought. However, from the perspective of the end user, it makes more sense to integetrate the stats info into 'smcd stats'. Otherwise, it would make users confused to find out with which tool to check which statisic infornation. Sure, some improvement of the smc-tools is also needed
On 2024/2/16 22:24, Wenjia Zhang wrote: > > > On 11.01.24 13:00, Wen Gu wrote: >> This introduces some statistics attributes of loopback-ism. They can be >> read from /sys/devices/virtual/smc/loopback-ism/{xfer_tytes|dmbs_cnt}. >> >> Signed-off-by: Wen Gu <guwen@linux.alibaba.com> >> --- >> net/smc/smc_loopback.c | 74 ++++++++++++++++++++++++++++++++++++++++++ >> net/smc/smc_loopback.h | 22 +++++++++++++ >> 2 files changed, 96 insertions(+) >> > > I've read the comments from Jiri and your answer. I can understand your thought. However, from the perspective of the > end user, it makes more sense to integetrate the stats info into 'smcd stats'. Otherwise, it would make users confused > to find out with which tool to check which statisic infornation. Sure, some improvement of the smc-tools is also needed Thank you Wenjia. Let's draw an analogy with RDMA devices, which is used in SMC-R. If we want to check the RNIC status or statistics, we may use rdma statistic command, or ibv_devinfo command, or check file under /sys/class/infiniband/mlx5_0. These provide details or attributes related to *devices*. Since s390 ISM can be used out of SMC, I guess it also has its own way (other than smc-tools) to check the statistic? What we can see in smcr stats or smcd stats command is about statistic or status of SMC *protocol* layer, such as DMB status, Tx/Rx, connections, fallbacks. If we put the underlying devices's statistics into smc-tools, should we also put RNIC statistics or s390 ISM statistics into smcr stat or smcd stat? and for each futures device that can be used by SMC-R/SMC-D, should we update them into smcr stat and smcd stat? And the attributes of each devices may be different, should we add entries in smcd stat for each of them? After considering the above things, I believe that the details of the underlying device should not be exposed to smc(smc-tools). What do you think? Thanks!
On 20.02.24 03:45, Wen Gu wrote: > > > On 2024/2/16 22:24, Wenjia Zhang wrote: >> >> >> On 11.01.24 13:00, Wen Gu wrote: >>> This introduces some statistics attributes of loopback-ism. They can be >>> read from /sys/devices/virtual/smc/loopback-ism/{xfer_tytes|dmbs_cnt}. >>> >>> Signed-off-by: Wen Gu <guwen@linux.alibaba.com> >>> --- >>> net/smc/smc_loopback.c | 74 ++++++++++++++++++++++++++++++++++++++++++ >>> net/smc/smc_loopback.h | 22 +++++++++++++ >>> 2 files changed, 96 insertions(+) >>> >> >> I've read the comments from Jiri and your answer. I can understand >> your thought. However, from the perspective of the end user, it makes >> more sense to integetrate the stats info into 'smcd stats'. Otherwise, >> it would make users confused to find out with which tool to check >> which statisic infornation. Sure, some improvement of the smc-tools is >> also needed > > Thank you Wenjia. > > Let's draw an analogy with RDMA devices, which is used in SMC-R. If we want > to check the RNIC status or statistics, we may use rdma statistic > command, or > ibv_devinfo command, or check file under /sys/class/infiniband/mlx5_0. > These > provide details or attributes related to *devices*. > > Since s390 ISM can be used out of SMC, I guess it also has its own way > (other > than smc-tools) to check the statistic? > > What we can see in smcr stats or smcd stats command is about statistic or > status of SMC *protocol* layer, such as DMB status, Tx/Rx, connections, > fallbacks. > > If we put the underlying devices's statistics into smc-tools, should we > also > put RNIC statistics or s390 ISM statistics into smcr stat or smcd stat? and > for each futures device that can be used by SMC-R/SMC-D, should we > update them > into smcr stat and smcd stat? And the attributes of each devices may be > different, > should we add entries in smcd stat for each of them? > > After considering the above things, I believe that the details of the > underlying > device should not be exposed to smc(smc-tools). What do you think? > > Thanks! > That is a very good point. It really depends on how we understand *devices* and how we want to use it. The more we are thinking, the more complicated the thing is getting. I'm trying to find accurate definitions on modeling virtual devices hoping that would make things eaiser. Unfortunately, it is not easy. Finally, I found this article: https://lwn.net/Articles/645810/ (Heads up! It is even from nine years ago, I'm not sure how reliable it is.) With the insight of this article, I'm trying to summarize my thought: It looks good to put the loopback-ism under the /sys/devices/virtual, especially according to the article " ... it is simply a place to put things that don't belong anywhere else. " However, in practice we use this in the term of simulated ism, which includes not only loopback-ism, but also other ones. Thus, does it not make sense to classify all of them together? E.g. same bus (just a half-baked idea) Then the following questions are comig up: - How should we organize them? - Should it show up in the smc_rnics? - How should it be seen from the perspective of the container? - If we see this loopback-ism as a *device*, should we not only put the device related information under the /sys? Thus, dmbs_cnt seems ok, but xfer_tytes not. Besides, we have a field in smd stat naming "Data transmitted (Bytes)", which should be suitable for this information.
On 2024/2/23 22:13, Wenjia Zhang wrote: > > > On 20.02.24 03:45, Wen Gu wrote: >> >> >> On 2024/2/16 22:24, Wenjia Zhang wrote: >>> >>> >>> On 11.01.24 13:00, Wen Gu wrote: >>>> This introduces some statistics attributes of loopback-ism. They can be >>>> read from /sys/devices/virtual/smc/loopback-ism/{xfer_tytes|dmbs_cnt}. >>>> >>>> Signed-off-by: Wen Gu <guwen@linux.alibaba.com> >>>> --- >>>> net/smc/smc_loopback.c | 74 ++++++++++++++++++++++++++++++++++++++++++ >>>> net/smc/smc_loopback.h | 22 +++++++++++++ >>>> 2 files changed, 96 insertions(+) >>>> >>> >>> I've read the comments from Jiri and your answer. I can understand your thought. However, from the perspective of the >>> end user, it makes more sense to integetrate the stats info into 'smcd stats'. Otherwise, it would make users >>> confused to find out with which tool to check which statisic infornation. Sure, some improvement of the smc-tools is >>> also needed >> >> Thank you Wenjia. >> >> Let's draw an analogy with RDMA devices, which is used in SMC-R. If we want >> to check the RNIC status or statistics, we may use rdma statistic command, or >> ibv_devinfo command, or check file under /sys/class/infiniband/mlx5_0. These >> provide details or attributes related to *devices*. >> >> Since s390 ISM can be used out of SMC, I guess it also has its own way (other >> than smc-tools) to check the statistic? >> >> What we can see in smcr stats or smcd stats command is about statistic or >> status of SMC *protocol* layer, such as DMB status, Tx/Rx, connections, fallbacks. >> >> If we put the underlying devices's statistics into smc-tools, should we also >> put RNIC statistics or s390 ISM statistics into smcr stat or smcd stat? and >> for each futures device that can be used by SMC-R/SMC-D, should we update them >> into smcr stat and smcd stat? And the attributes of each devices may be different, >> should we add entries in smcd stat for each of them? >> >> After considering the above things, I believe that the details of the underlying >> device should not be exposed to smc(smc-tools). What do you think? >> >> Thanks! >> > That is a very good point. It really depends on how we understand *devices* and how we want to use it. The more we are > thinking, the more complicated the thing is getting. I'm trying to find accurate definitions on modeling virtual devices > hoping that would make things eaiser. Unfortunately, it is not easy. Finally, I found this article: > https://lwn.net/Articles/645810/ (Heads up! It is even from nine years ago, I'm not sure how reliable it is.) With the > insight of this article, I'm trying to summarize my thought: > > It looks good to put the loopback-ism under the /sys/devices/virtual, especially according to the article > " > ... it is simply a place to put things that don't belong anywhere else. > " Yes, it can also be reflected from the implementation of get_device_parent(): static struct kobject *get_device_parent(struct device *dev, struct device *parent) { <...> /* * If we have no parent, we live in "virtual". * Class-devices with a non class-device as parent, live * in a "glue" directory to prevent namespace collisions. */ if (parent == NULL) parent_kobj = virtual_device_parent(dev); else if (parent->class && !dev->class->ns_type) { subsys_put(sp); return &parent->kobj; } else { parent_kobj = &parent->kobj; } <...> } > However, in practice we use this in the term of simulated ism, which includes not only loopback-ism, but also other > ones. Thus, does it not make sense to classify all of them together? E.g. same bus (just a half-baked idea) > > Then the following questions are comig up: > - How should we organize them? > - Should it show up in the smc_rnics? > - How should it be seen from the perspective of the container? > - If we see this loopback-ism as a *device*, should we not only put the device related information under the /sys? Thus, > dmbs_cnt seems ok, but xfer_tytes not. Besides, we have a field in smd stat naming "Data transmitted (Bytes)", which > should be suitable for this information. Actually I created 'smc' class under /sys/devices/virtual just to place loopback-ism, since it doesn't seem to belong to a certain class of device and serves only SMC. Other 'smc devices', e.g. RDMA device, s390 ISM and other Emulated-ISM like virtio-ism, all belong to a certain class or bus, so I have no intention of putting them under the same path. But now looks like that the 'smc' class and /sys/devices/virtual/smc path will lead people to mistakenly think that there is a class of 'SMC devices', but in fact these 'SMC devices' belongs to different classes or buses. They can be used by SMC and any other users. So I think it is better to avoid creating such 'smc' class. Alternatively, after referring to other examples in the kernel, I think another choice is to to put loopback-ism under /sys/devices/virtual/misc/, for devices which can't fit in a specific class. What do you think? Thanks a lot!
diff --git a/net/smc/smc_loopback.c b/net/smc/smc_loopback.c index 3bf7bf5e8c96..a89dbf84aea5 100644 --- a/net/smc/smc_loopback.c +++ b/net/smc/smc_loopback.c @@ -30,6 +30,65 @@ static struct class *smc_class; static int smcd_lo_register_dev(struct smc_lo_dev *ldev); static void smcd_lo_unregister_dev(struct smc_lo_dev *ldev); +static void smc_lo_clear_stats(struct smc_lo_dev *ldev) +{ + struct smc_lo_dev_stats64 *tmp; + int cpu; + + for_each_possible_cpu(cpu) { + tmp = per_cpu_ptr(ldev->stats, cpu); + tmp->xfer_bytes = 0; + } +} + +static void smc_lo_get_stats(struct smc_lo_dev *ldev, + struct smc_lo_dev_stats64 *stats) +{ + int size, cpu, i; + u64 *src, *sum; + + memset(stats, 0, sizeof(*stats)); + size = sizeof(*stats) / sizeof(u64); + for_each_possible_cpu(cpu) { + src = (u64 *)per_cpu_ptr(ldev->stats, cpu); + sum = (u64 *)stats; + for (i = 0; i < size; i++) + *(sum++) += *(src++); + } +} + +static ssize_t smc_lo_show_stats(struct device *dev, + struct device_attribute *attr, + char *buf, unsigned long offset) +{ + struct smc_lo_dev *ldev = + container_of(dev, struct smc_lo_dev, dev); + struct smc_lo_dev_stats64 stats; + ssize_t ret = -EINVAL; + + if (WARN_ON(offset > sizeof(struct smc_lo_dev_stats64) || + offset % sizeof(u64) != 0)) + goto out; + + smc_lo_get_stats(ldev, &stats); + ret = sysfs_emit(buf, "%llu\n", *(u64 *)(((u8 *)&stats) + offset)); +out: + return ret; +} + +/* generate a read-only statistics attribute */ +#define SMC_LO_DEVICE_ATTR_RO(name) \ +static ssize_t name##_show(struct device *dev, \ + struct device_attribute *attr, char *buf) \ +{ \ + return smc_lo_show_stats(dev, attr, buf, \ + offsetof(struct smc_lo_dev_stats64, name)); \ +} \ +static DEVICE_ATTR_RO(name) + +SMC_LO_DEVICE_ATTR_RO(xfer_bytes); +SMC_LO_DEVICE_ATTR_RO(dmbs_cnt); + static ssize_t active_show(struct device *dev, struct device_attribute *attr, char *buf) { @@ -67,6 +126,8 @@ static ssize_t active_store(struct device *dev, static DEVICE_ATTR_RW(active); static struct attribute *smc_lo_attrs[] = { &dev_attr_active.attr, + &dev_attr_xfer_bytes.attr, + &dev_attr_dmbs_cnt.attr, NULL, }; @@ -152,6 +213,7 @@ static int smc_lo_register_dmb(struct smcd_dev *smcd, struct smcd_dmb *dmb, } hash_add(ldev->dmb_ht, &dmb_node->list, dmb_node->token); write_unlock(&ldev->dmb_ht_lock); + SMC_LO_STAT_DMBS_INC(ldev); dmb->sba_idx = dmb_node->sba_idx; dmb->dmb_tok = dmb_node->token; @@ -191,6 +253,7 @@ static int smc_lo_unregister_dmb(struct smcd_dev *smcd, struct smcd_dmb *dmb) clear_bit(dmb_node->sba_idx, ldev->sba_idx_mask); kfree(dmb_node->cpu_addr); kfree(dmb_node); + SMC_LO_STAT_DMBS_DEC(ldev); return 0; } @@ -249,6 +312,8 @@ static int smc_lo_move_data(struct smcd_dev *smcd, u64 dmb_tok, if (conn && !conn->killed) smcd_cdc_rx_handler(conn); + } else { + SMC_LO_STAT_XFER_BYTES(ldev, size); } return 0; } @@ -354,6 +419,7 @@ static void smcd_lo_unregister_dev(struct smc_lo_dev *ldev) mutex_unlock(&smcd_dev_list.mutex); kfree(smcd->conn); kfree(smcd); + smc_lo_clear_stats(ldev); } static int smc_lo_dev_init(struct smc_lo_dev *ldev) @@ -374,6 +440,7 @@ static void smc_lo_dev_release(struct device *dev) struct smc_lo_dev *ldev = container_of(dev, struct smc_lo_dev, dev); + free_percpu(ldev->stats); kfree(ldev); } @@ -392,6 +459,13 @@ static int smc_lo_dev_probe(void) goto destroy_class; } + ldev->stats = alloc_percpu(struct smc_lo_dev_stats64); + if (!ldev->stats) { + ret = -ENOMEM; + kfree(ldev); + goto destroy_class; + } + ldev->dev.parent = NULL; ldev->dev.class = smc_class; ldev->dev.groups = smc_lo_attr_groups; diff --git a/net/smc/smc_loopback.h b/net/smc/smc_loopback.h index 02a522e322b4..d4572ca42f08 100644 --- a/net/smc/smc_loopback.h +++ b/net/smc/smc_loopback.h @@ -32,16 +32,38 @@ struct smc_lo_dmb_node { dma_addr_t dma_addr; }; +struct smc_lo_dev_stats64 { + __u64 xfer_bytes; + __u64 dmbs_cnt; +}; + struct smc_lo_dev { struct smcd_dev *smcd; struct device dev; u8 active; u16 chid; struct smcd_gid local_gid; + struct smc_lo_dev_stats64 __percpu *stats; rwlock_t dmb_ht_lock; DECLARE_BITMAP(sba_idx_mask, SMC_LO_MAX_DMBS); DECLARE_HASHTABLE(dmb_ht, SMC_LO_DMBS_HASH_BITS); }; + +#define SMC_LO_STAT_SUB(ldev, key, val) \ +do { \ + struct smc_lo_dev_stats64 *_stats = (ldev)->stats; \ + this_cpu_add((*(_stats)).key, val); \ +} \ +while (0) + +#define SMC_LO_STAT_XFER_BYTES(ldev, val) \ + SMC_LO_STAT_SUB(ldev, xfer_bytes, val) + +#define SMC_LO_STAT_DMBS_INC(ldev) \ + SMC_LO_STAT_SUB(ldev, dmbs_cnt, 1) + +#define SMC_LO_STAT_DMBS_DEC(ldev) \ + SMC_LO_STAT_SUB(ldev, dmbs_cnt, -1) #endif int smc_loopback_init(void);
This introduces some statistics attributes of loopback-ism. They can be read from /sys/devices/virtual/smc/loopback-ism/{xfer_tytes|dmbs_cnt}. Signed-off-by: Wen Gu <guwen@linux.alibaba.com> --- net/smc/smc_loopback.c | 74 ++++++++++++++++++++++++++++++++++++++++++ net/smc/smc_loopback.h | 22 +++++++++++++ 2 files changed, 96 insertions(+)