net/core/net-procfs: use seq_put_decimal_ull_width() for decimal values in /proc/net/dev

Message ID 20241110045221.4959-1-00107082@163.com (mailing list archive)
State Rejected
Delegated to: Netdev Maintainers
Series net/core/net-procfs: use seq_put_decimal_ull_width() for decimal values in /proc/net/dev

Checks

Context Check Description
netdev/series_format warning Single patches do not need cover letters; Target tree name not specified in the subject
netdev/tree_selection success Guessed tree name to be net-next
netdev/ynl success Generated files up to date; no warnings/errors; no diff in generated;
netdev/fixes_present success Fixes tag not required for -next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 3 this patch: 3
netdev/build_tools success No tools touched, skip
netdev/cc_maintainers warning 1 maintainers not CCed: horms@kernel.org
netdev/build_clang success Errors and warnings before: 3 this patch: 3
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn success Errors and warnings before: 4 this patch: 4
netdev/checkpatch warning WARNING: line length of 84 exceeds 80 columns; WARNING: line length of 91 exceeds 80 columns; WARNING: line length of 92 exceeds 80 columns; WARNING: line length of 93 exceeds 80 columns; WARNING: line length of 97 exceeds 80 columns
netdev/build_clang_rust success No Rust files in patch. Skipping build
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0
netdev/contest success net-next-2024-11-11--21-00 (tests: 787)

Commit Message

David Wang Nov. 10, 2024, 4:52 a.m. UTC
seq_printf() is costly: when reading /proc/net/dev, profiling attributes
about 13% of samples to seq_printf():
	dev_seq_show(98.350% 428046/435229)
	    dev_seq_printf_stats(99.777% 427092/428046)
		dev_get_stats(86.121% 367814/427092)
		    rtl8169_get_stats64(98.519% 362365/367814)
		    dev_fetch_sw_netstats(0.554% 2038/367814)
		    loopback_get_stats64(0.250% 919/367814)
		    dev_get_tstats64(0.077% 284/367814)
		    netdev_stats_to_stats64(0.051% 189/367814)
		    _find_next_bit(0.029% 106/367814)
		seq_printf(13.719% 58594/427092)
And on a system with one wireless interface, timing for 1 million rounds of
stress reading /proc/net/dev:
	real	0m51.828s
	user	0m0.225s
	sys	0m51.671s
On average, reading /proc/net/dev takes ~0.051ms

With this patch, the extra cost of parsing the format string in seq_printf()
can be avoided, and the timing for 1 million rounds of reads is:
	real	0m49.127s
	user	0m0.295s
	sys	0m48.552s
On average, reading /proc/net/dev now takes ~0.048ms, a ~6% improvement.

Even though dev_get_stats() accounts for the majority of the read time,
the improvement is still significant. The size of the improvement may vary
with the physical interfaces on the system.

Signed-off-by: David Wang <00107082@163.com>
---
 net/core/net-procfs.c | 37 ++++++++++++++++++++-----------------
 1 file changed, 20 insertions(+), 17 deletions(-)
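
The measurement described in the commit message can be approximated with a small user-space loop. The sketch below is not the author's benchmark; it assumes the file is reopened and read to EOF on every round, and the helper names time_reads() and relative_improvement() are made up for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdio.h>
#include <time.h>
#include <unistd.h>

/*
 * Open `path`, read it to EOF and close it, `rounds` times.
 * Returns the elapsed wall-clock time in seconds, or -1.0 on error.
 * (Hypothetical helper, not part of the patch.)
 */
double time_reads(const char *path, long rounds)
{
	char buf[8192];
	struct timespec t0, t1;

	if (clock_gettime(CLOCK_MONOTONIC, &t0) != 0)
		return -1.0;
	for (long i = 0; i < rounds; i++) {
		int fd = open(path, O_RDONLY);
		ssize_t n;

		if (fd < 0)
			return -1.0;
		/* Discard the data; only the cost of producing it matters. */
		while ((n = read(fd, buf, sizeof(buf))) > 0)
			;
		close(fd);
		if (n < 0)
			return -1.0;
	}
	if (clock_gettime(CLOCK_MONOTONIC, &t1) != 0)
		return -1.0;
	return (double)(t1.tv_sec - t0.tv_sec) +
	       (double)(t1.tv_nsec - t0.tv_nsec) / 1e9;
}

/* Improvement of `after` over `before`, as a fraction of `before`. */
double relative_improvement(double before, double after)
{
	return (before - after) / before;
}
```

Dividing the value returned by time_reads("/proc/net/dev", 1000000) by the round count gives the mean cost per read. Applied to the sys times quoted above, relative_improvement(51.671, 48.552) is about 0.06, which matches the ~6% figure.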

Comments

Paolo Abeni Nov. 14, 2024, 9:17 a.m. UTC | #1
On 11/10/24 05:52, David Wang wrote:
> seq_printf() is costly: when reading /proc/net/dev, profiling attributes
> about 13% of samples to seq_printf():
> 	dev_seq_show(98.350% 428046/435229)
> 	    dev_seq_printf_stats(99.777% 427092/428046)
> 		dev_get_stats(86.121% 367814/427092)
> 		    rtl8169_get_stats64(98.519% 362365/367814)
> 		    dev_fetch_sw_netstats(0.554% 2038/367814)
> 		    loopback_get_stats64(0.250% 919/367814)
> 		    dev_get_tstats64(0.077% 284/367814)
> 		    netdev_stats_to_stats64(0.051% 189/367814)
> 		    _find_next_bit(0.029% 106/367814)
> 		seq_printf(13.719% 58594/427092)
> And on a system with one wireless interface, timing for 1 million rounds of
> stress reading /proc/net/dev:
> 	real	0m51.828s
> 	user	0m0.225s
> 	sys	0m51.671s
> On average, reading /proc/net/dev takes ~0.051ms
> 
> With this patch, the extra cost of parsing the format string in seq_printf()
> can be avoided, and the timing for 1 million rounds of reads is:
> 	real	0m49.127s
> 	user	0m0.295s
> 	sys	0m48.552s
> On average, reading /proc/net/dev now takes ~0.048ms, a ~6% improvement.
> 
> Even though dev_get_stats() accounts for the majority of the read time,
> the improvement is still significant. The size of the improvement may vary
> with the physical interfaces on the system.
> 
> Signed-off-by: David Wang <00107082@163.com>

If user space is concerned with performance, it must use netlink.
Optimizing a legacy interface sends, IMHO, a very wrong message.

I'm sorry, I think we should not accept this change.

/P
David Wang Nov. 14, 2024, 9:56 a.m. UTC | #2
At 2024-11-14 17:17:32, "Paolo Abeni" <pabeni@redhat.com> wrote:
>
>
>On 11/10/24 05:52, David Wang wrote:
>> seq_printf() is costly: when reading /proc/net/dev, profiling attributes
>> about 13% of samples to seq_printf():
>> 	dev_seq_show(98.350% 428046/435229)
>> 	    dev_seq_printf_stats(99.777% 427092/428046)
>> 		dev_get_stats(86.121% 367814/427092)
>> 		    rtl8169_get_stats64(98.519% 362365/367814)
>> 		    dev_fetch_sw_netstats(0.554% 2038/367814)
>> 		    loopback_get_stats64(0.250% 919/367814)
>> 		    dev_get_tstats64(0.077% 284/367814)
>> 		    netdev_stats_to_stats64(0.051% 189/367814)
>> 		    _find_next_bit(0.029% 106/367814)
>> 		seq_printf(13.719% 58594/427092)
>> And on a system with one wireless interface, timing for 1 million rounds of
>> stress reading /proc/net/dev:
>> 	real	0m51.828s
>> 	user	0m0.225s
>> 	sys	0m51.671s
>> On average, reading /proc/net/dev takes ~0.051ms
>> 
>> With this patch, the extra cost of parsing the format string in seq_printf()
>> can be avoided, and the timing for 1 million rounds of reads is:
>> 	real	0m49.127s
>> 	user	0m0.295s
>> 	sys	0m48.552s
>> On average, reading /proc/net/dev now takes ~0.048ms, a ~6% improvement.
>> 
>> Even though dev_get_stats() accounts for the majority of the read time,
>> the improvement is still significant. The size of the improvement may vary
>> with the physical interfaces on the system.
>> 
>> Signed-off-by: David Wang <00107082@163.com>
>
>If user space is concerned with performance, it must use netlink.
>Optimizing a legacy interface sends, IMHO, a very wrong message.
>
>I'm sorry, I think we should not accept this change.

It's OK.
In my monitoring tools I have been using /proc/net/dev to gauge the transmit/receive rate of each interface,
and /proc/net/netstat to watch for abnormalities. I guess my knowledge is quite out of date now.
I will look into netlink. Thanks for the information.

>
>/P

Thanks
David
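
For readers who want to follow the netlink suggestion from this thread, the per-interface counters shown in /proc/net/dev are also available through an rtnetlink link dump. What follows is a minimal sketch, not code from the thread: it assumes an RTM_GETLINK dump carrying the IFLA_IFNAME and IFLA_STATS64 attributes, and the function names parse_link_msg() and dump_link_stats() are made up for illustration. Error handling and buffer sizing are deliberately simple.

```c
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <linux/if_link.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>

/*
 * Pull the interface name and 64-bit counters out of one RTM_NEWLINK
 * message. Returns 0 if both attributes were found, -1 otherwise.
 */
int parse_link_msg(struct nlmsghdr *nh, char *name, size_t name_len,
		   struct rtnl_link_stats64 *st)
{
	struct ifinfomsg *ifm = NLMSG_DATA(nh);
	struct rtattr *rta = IFLA_RTA(ifm);
	int rlen = IFLA_PAYLOAD(nh);
	int have_name = 0, have_stats = 0;

	for (; RTA_OK(rta, rlen); rta = RTA_NEXT(rta, rlen)) {
		size_t n = RTA_PAYLOAD(rta);

		if (rta->rta_type == IFLA_IFNAME) {
			snprintf(name, name_len, "%.*s", (int)n,
				 (char *)RTA_DATA(rta));
			have_name = 1;
		} else if (rta->rta_type == IFLA_STATS64) {
			/* Copy at most what this build's struct can hold. */
			memset(st, 0, sizeof(*st));
			memcpy(st, RTA_DATA(rta),
			       n < sizeof(*st) ? n : sizeof(*st));
			have_stats = 1;
		}
	}
	return (have_name && have_stats) ? 0 : -1;
}

/*
 * Request a dump of all links and print a few counters per interface.
 * Returns the number of interfaces printed, or -1 on error.
 */
int dump_link_stats(FILE *out)
{
	struct {
		struct nlmsghdr nlh;
		struct ifinfomsg ifm;
	} req;
	char buf[32768] __attribute__((aligned(8)));
	int count = 0;
	int fd = socket(AF_NETLINK, SOCK_RAW, NETLINK_ROUTE);

	if (fd < 0)
		return -1;
	memset(&req, 0, sizeof(req));
	req.nlh.nlmsg_len = NLMSG_LENGTH(sizeof(req.ifm));
	req.nlh.nlmsg_type = RTM_GETLINK;
	req.nlh.nlmsg_flags = NLM_F_REQUEST | NLM_F_DUMP;
	req.ifm.ifi_family = AF_UNSPEC;
	if (send(fd, &req, req.nlh.nlmsg_len, 0) < 0)
		goto fail;

	for (;;) {
		int len = (int)recv(fd, buf, sizeof(buf), 0);
		struct nlmsghdr *nh;

		if (len <= 0)
			goto fail;
		for (nh = (struct nlmsghdr *)buf; NLMSG_OK(nh, len);
		     nh = NLMSG_NEXT(nh, len)) {
			char name[64];
			struct rtnl_link_stats64 st;

			if (nh->nlmsg_type == NLMSG_DONE) {
				close(fd);
				return count;
			}
			if (nh->nlmsg_type == NLMSG_ERROR)
				goto fail;
			if (nh->nlmsg_type != RTM_NEWLINK ||
			    parse_link_msg(nh, name, sizeof(name), &st) != 0)
				continue;
			fprintf(out, "%s: rx_bytes %llu rx_packets %llu "
				"tx_bytes %llu tx_packets %llu\n", name,
				(unsigned long long)st.rx_bytes,
				(unsigned long long)st.rx_packets,
				(unsigned long long)st.tx_bytes,
				(unsigned long long)st.tx_packets);
			count++;
		}
	}
fail:
	close(fd);
	return -1;
}
```

Calling dump_link_stats(stdout) from a monitoring loop returns the counters in binary form, so no text formatting is needed in the kernel and no text parsing in user space. A long-running tool would probably keep the socket open between polls rather than recreate it each time.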
Patch

diff --git a/net/core/net-procfs.c b/net/core/net-procfs.c
index fa6d3969734a..a0d6c5b32b58 100644
--- a/net/core/net-procfs.c
+++ b/net/core/net-procfs.c
@@ -46,23 +46,26 @@  static void dev_seq_printf_stats(struct seq_file *seq, struct net_device *dev)
 	struct rtnl_link_stats64 temp;
 	const struct rtnl_link_stats64 *stats = dev_get_stats(dev, &temp);
 
-	seq_printf(seq, "%6s: %7llu %7llu %4llu %4llu %4llu %5llu %10llu %9llu "
-		   "%8llu %7llu %4llu %4llu %4llu %5llu %7llu %10llu\n",
-		   dev->name, stats->rx_bytes, stats->rx_packets,
-		   stats->rx_errors,
-		   stats->rx_dropped + stats->rx_missed_errors,
-		   stats->rx_fifo_errors,
-		   stats->rx_length_errors + stats->rx_over_errors +
-		    stats->rx_crc_errors + stats->rx_frame_errors,
-		   stats->rx_compressed, stats->multicast,
-		   stats->tx_bytes, stats->tx_packets,
-		   stats->tx_errors, stats->tx_dropped,
-		   stats->tx_fifo_errors, stats->collisions,
-		   stats->tx_carrier_errors +
-		    stats->tx_aborted_errors +
-		    stats->tx_window_errors +
-		    stats->tx_heartbeat_errors,
-		   stats->tx_compressed);
+	seq_printf(seq, "%6s:", dev->name);
+	seq_put_decimal_ull_width(seq, " ", stats->rx_bytes, 7);
+	seq_put_decimal_ull_width(seq, " ", stats->rx_packets, 7);
+	seq_put_decimal_ull_width(seq, " ", stats->rx_errors, 4);
+	seq_put_decimal_ull_width(seq, " ", stats->rx_dropped + stats->rx_missed_errors, 4);
+	seq_put_decimal_ull_width(seq, " ", stats->rx_fifo_errors, 4);
+	seq_put_decimal_ull_width(seq, " ", stats->rx_length_errors + stats->rx_over_errors +
+				  stats->rx_crc_errors + stats->rx_frame_errors, 5);
+	seq_put_decimal_ull_width(seq, " ", stats->rx_compressed, 10);
+	seq_put_decimal_ull_width(seq, " ", stats->multicast, 9);
+	seq_put_decimal_ull_width(seq, " ", stats->tx_bytes, 8);
+	seq_put_decimal_ull_width(seq, " ", stats->tx_packets, 7);
+	seq_put_decimal_ull_width(seq, " ", stats->tx_errors, 4);
+	seq_put_decimal_ull_width(seq, " ", stats->tx_dropped, 4);
+	seq_put_decimal_ull_width(seq, " ", stats->tx_fifo_errors, 4);
+	seq_put_decimal_ull_width(seq, " ", stats->collisions, 5);
+	seq_put_decimal_ull_width(seq, " ", stats->tx_carrier_errors + stats->tx_aborted_errors +
+				  stats->tx_window_errors + stats->tx_heartbeat_errors, 7);
+	seq_put_decimal_ull_width(seq, " ", stats->tx_compressed, 10);
+	seq_putc(seq, '\n');
 }
 
 /*
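
The patch relies on each seq_put_decimal_ull_width(seq, " ", value, N) call producing the same bytes as a " %Nllu" conversion, so that the layout of /proc/net/dev does not change. The user-space sketch below illustrates that equivalence. It is not the kernel implementation: put_decimal_ull_width() is a stand-in written for this example, and it assumes the kernel helper writes the delimiter first and then left-pads the number with spaces to the requested width.

```c
#include <stdio.h>
#include <string.h>

/*
 * User-space stand-in (hypothetical): write `delimiter`, then `num` in
 * decimal, right-aligned in at least `width` columns, into `dst`.
 * Returns the number of bytes written (excluding the NUL terminator),
 * or 0 if the result does not fit in `size` bytes.
 */
size_t put_decimal_ull_width(char *dst, size_t size, const char *delimiter,
			     unsigned long long num, unsigned int width)
{
	char digits[20];	/* 2^64 - 1 has 20 decimal digits */
	size_t ndigits = 0, dlen = strlen(delimiter), pad, total;

	/* Generate digits least-significant first; no format parsing. */
	do {
		digits[ndigits++] = (char)('0' + num % 10);
		num /= 10;
	} while (num);

	pad = width > ndigits ? width - ndigits : 0;
	total = dlen + pad + ndigits;
	if (total >= size)
		return 0;

	memcpy(dst, delimiter, dlen);
	memset(dst + dlen, ' ', pad);
	for (size_t i = 0; i < ndigits; i++)
		dst[dlen + pad + i] = digits[ndigits - 1 - i];
	dst[total] = '\0';
	return total;
}
```

For example, put_decimal_ull_width(buf, sizeof(buf), " ", 42, 7) and snprintf(buf, sizeof(buf), " %7llu", 42ULL) leave the same string in buf. The first gets there without interpreting a format string at run time, which is the saving the commit message measures.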