Message ID | 20180112185812.7710-1-avagin@openvz.org (mailing list archive) |
---|---|
State | New, archived |
On Fri, 12 Jan 2018 10:58:11 -0800 Andrei Vagin <avagin@openvz.org> wrote:

> seq_put_hex_ll() prints a number in hexadecimal notation and works
> faster than seq_printf().
>
> ...
>
> --- a/fs/seq_file.c
> +++ b/fs/seq_file.c
> @@ -670,6 +670,26 @@ void seq_puts(struct seq_file *m, const char *s)
>  }
>  EXPORT_SYMBOL(seq_puts);
>
> +static inline void seq_put_delimeter(struct seq_file *m, const char *delimiter)
> +{
> +	int len;
> +
> +	if (!delimiter || !delimiter[0])
> +		return;
> +
> +	if (delimiter[1] == 0)
> +		return seq_putc(m, delimiter[0]);
> +
> +	len = strlen(delimiter);
> +	if (m->count + len >= m->size) {
> +		seq_set_overflow(m);
> +		return;
> +	}
> +
> +	memcpy(m->buf + m->count, delimiter, len);
> +	m->count += len;
> +}

Can we please have a nice comment describing this function's role and
behaviour?

I don't think the `inline' is needed or desirable - gcc can figure that
out, and with three callsites a `noinline' would be more justified!

That `return seq_putc(...)' will generate a warning in some situations
- seq_putc() returns void.  Let's split it into 'seq_putc(...);
return;' please.

> +/**
> + * seq_put_hex_ll - put a number in hexadecimal notation
> + * @m: seq_file identifying the buffer to which data should be written
> + * @delimiter: a string which is printed before the number
> + * @v: the number
> + * @width: a minimum field width
> + *
> + * seq_put_hex_ll(m, "", v, 8) is equal to seq_printf(m, "0x08llx", v)
> + *
> + * This routine is very quick when you show lots of numbers.
> + * In usual cases, it will be better to use seq_printf(). It's easier to read.
> + */
> +void seq_put_hex_ll(struct seq_file *m, const char *delimiter,
> +				unsigned long long v, int width)
> +{
> +	int i, len;
> +
> +	seq_put_delimeter(m, delimiter);
> +
> +	len = (sizeof(v) * 8 - __builtin_clzll(v) + 3) / 4;
> +
> +	if (unlikely(len == 0))
> +		len = 1;
> +
> +	if (len < width)
> +		len = width;
> +
> +	if (m->count + len > m->size)
> +		goto overflow;
> +
> +	for (i = len - 1; i >= 0; i--) {
> +		m->buf[m->count + i] = hex_asc[0xf & v];
> +		v = v >> 4;
> +	}
> +	m->count += len;
> +	return;
> +overflow:
> +	seq_set_overflow(m);
> +}

I don't think we need the goto.  Just do "seq_set_overflow(m); return;".
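For readers following the review, here is a minimal userspace sketch of the hex-formatting loop with the goto replaced by an early return, as suggested above. It is not the kernel code: `struct sketch_buf`, `sketch_put_hex()` and the `overflowed` flag are hypothetical stand-ins for `struct seq_file`, `seq_put_hex_ll()` and `seq_set_overflow()`. The sketch also handles zero before calling `__builtin_clzll()`, because GCC documents the result of that builtin as undefined for a zero argument; the quoted patch appears to rely on it returning 64 in that case.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct seq_file; not the kernel definition. */
struct sketch_buf {
	char buf[64];
	size_t count;		/* bytes written so far */
	size_t size;		/* usable capacity of buf */
	int overflowed;		/* stands in for seq_set_overflow() */
};

static const char sketch_hex_asc[] = "0123456789abcdef";

/* Append v in hexadecimal, zero-padded to at least `width` digits. */
static void sketch_put_hex(struct sketch_buf *m, unsigned long long v, int width)
{
	int i, len;

	/* __builtin_clzll(0) is undefined, so zero gets its own branch. */
	if (v == 0)
		len = 1;
	else
		len = (int)((sizeof(v) * 8 - __builtin_clzll(v) + 3) / 4);

	if (len < width)
		len = width;

	/* Early return instead of "goto overflow". */
	if (m->count + len > m->size) {
		m->overflowed = 1;
		return;
	}

	/* Fill the field from its last digit back to its first. */
	for (i = len - 1; i >= 0; i--) {
		m->buf[m->count + i] = sketch_hex_asc[v & 0xf];
		v >>= 4;
	}
	m->count += len;
}
```

The number of digits is the bit length of `v` rounded up to a multiple of four and divided by four, so a 31-bit value such as 0x7ffc1234 needs 8 digits and a 33-bit value needs 9.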
On Fri, Jan 12, 2018 at 03:33:04PM -0800, Andrew Morton wrote:
> On Fri, 12 Jan 2018 10:58:11 -0800 Andrei Vagin <avagin@openvz.org> wrote:
>
> > seq_put_hex_ll() prints a number in hexadecimal notation and works
> > faster than seq_printf().
> >
> > ...
> >
> > --- a/fs/seq_file.c
> > +++ b/fs/seq_file.c
> > @@ -670,6 +670,26 @@ void seq_puts(struct seq_file *m, const char *s)
> >  }
> >  EXPORT_SYMBOL(seq_puts);
> >
> > +static inline void seq_put_delimeter(struct seq_file *m, const char *delimiter)
> > +{
> > +	int len;
> > +
> > +	if (!delimiter || !delimiter[0])
> > +		return;
> > +
> > +	if (delimiter[1] == 0)
> > +		return seq_putc(m, delimiter[0]);
> > +
> > +	len = strlen(delimiter);
> > +	if (m->count + len >= m->size) {
> > +		seq_set_overflow(m);
> > +		return;
> > +	}
> > +
> > +	memcpy(m->buf + m->count, delimiter, len);
> > +	m->count += len;
> > +}
>
> Can we please have a nice comment describing this function's role and
> behaviour?

seq_put_decimal_* and seq_put_hex_ll print a string before printing a
number. Originally it was just one symbol, which is probably why it is
called a delimiter.

I added an optimization for the case when the delimiter is one symbol, and
found that it significantly affects performance (about 13% for
/proc/pid/maps):

Without this optimization:
[root@fc24 ~]# time python test.py

real	0m9.105s
user	0m2.200s
sys	0m6.901s

With this optimization:
[root@fc24 ~]# time python test.py

real	0m8.097s
user	0m1.994s
sys	0m6.102s

If inline is replaced by noinline:
[root@fc24 ~]# time python test.py

real	0m8.263s
user	0m2.058s
sys	0m6.200s

[root@fc24 ~]# cat test.py
#!/usr/bin/env python2

num = 0
with open("/proc/1/maps") as f:
	for x in xrange(100000):
		data = f.read()
		f.seek(0, 0)

Andrew, thank you for the review, I will send a fixed patch soon.

>
> I don't think the `inline' is needed or desirable - gcc can figure that
> out, and with three callsites a `noinline' would be more justified!
>
> That `return seq_putc(...)' will generate a warning in some situations
> - seq_putc() returns void.  Let's split it into 'seq_putc(...);
> return;' please.
>
> > +/**
> > + * seq_put_hex_ll - put a number in hexadecimal notation
> > + * @m: seq_file identifying the buffer to which data should be written
> > + * @delimiter: a string which is printed before the number
> > + * @v: the number
> > + * @width: a minimum field width
> > + *
> > + * seq_put_hex_ll(m, "", v, 8) is equal to seq_printf(m, "0x08llx", v)
> > + *
> > + * This routine is very quick when you show lots of numbers.
> > + * In usual cases, it will be better to use seq_printf(). It's easier to read.
> > + */
> > +void seq_put_hex_ll(struct seq_file *m, const char *delimiter,
> > +				unsigned long long v, int width)
> > +{
> > +	int i, len;
> > +
> > +	seq_put_delimeter(m, delimiter);
> > +
> > +	len = (sizeof(v) * 8 - __builtin_clzll(v) + 3) / 4;
> > +
> > +	if (unlikely(len == 0))
> > +		len = 1;
> > +
> > +	if (len < width)
> > +		len = width;
> > +
> > +	if (m->count + len > m->size)
> > +		goto overflow;
> > +
> > +	for (i = len - 1; i >= 0; i--) {
> > +		m->buf[m->count + i] = hex_asc[0xf & v];
> > +		v = v >> 4;
> > +	}
> > +	m->count += len;
> > +	return;
> > +overflow:
> > +	seq_set_overflow(m);
> > +}
>
> I don't think we need the goto.  Just do "seq_set_overflow(m); return;".
>
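The one-character fast path discussed in this reply can be sketched in userspace as follows. This is an illustration under stated assumptions, not the fixed patch: `struct sketch_buf`, `sketch_putc()` and `sketch_put_delimiter()` are hypothetical names, and the `overflowed` flag stands in for `seq_set_overflow()`. The `return seq_putc(...)` form from the review is written as a call followed by a plain `return;`.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct seq_file; not the kernel definition. */
struct sketch_buf {
	char buf[64];
	size_t count;		/* bytes written so far */
	size_t size;		/* usable capacity of buf */
	int overflowed;		/* stands in for seq_set_overflow() */
};

/* Append one character; flagging overflow here is a simplification. */
static void sketch_putc(struct sketch_buf *m, char c)
{
	if (m->count >= m->size) {
		m->overflowed = 1;
		return;
	}
	m->buf[m->count++] = c;
}

/*
 * Write the optional string that precedes a number.
 * NULL and "" write nothing, a single character takes the cheap
 * sketch_putc() path, and longer strings are copied with memcpy().
 */
static void sketch_put_delimiter(struct sketch_buf *m, const char *delimiter)
{
	size_t len;

	if (!delimiter || !delimiter[0])
		return;

	if (delimiter[1] == '\0') {
		/* Call, then return: no "return <void expression>". */
		sketch_putc(m, delimiter[0]);
		return;
	}

	len = strlen(delimiter);
	/* Same ">=" bound as the quoted patch, leaving one byte spare. */
	if (m->count + len >= m->size) {
		m->overflowed = 1;
		return;
	}

	memcpy(m->buf + m->count, delimiter, len);
	m->count += len;
}
```

The fast path skips both `strlen()` and `memcpy()` for the common single-character case, which is the case every delimiter in the `/proc/pid/maps` header hits.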
diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index 339e4c1c044d..3a08685ef27c 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -282,15 +282,18 @@ static void show_vma_header_prefix(struct seq_file *m,
 				   dev_t dev, unsigned long ino)
 {
 	seq_setwidth(m, 25 + sizeof(void *) * 6 - 1);
-	seq_printf(m, "%08lx-%08lx %c%c%c%c %08llx %02x:%02x %lu ",
-		   start,
-		   end,
-		   flags & VM_READ ? 'r' : '-',
-		   flags & VM_WRITE ? 'w' : '-',
-		   flags & VM_EXEC ? 'x' : '-',
-		   flags & VM_MAYSHARE ? 's' : 'p',
-		   pgoff,
-		   MAJOR(dev), MINOR(dev), ino);
+	seq_put_hex_ll(m, NULL, start, 8);
+	seq_put_hex_ll(m, "-", end, 8);
+	seq_putc(m, ' ');
+	seq_putc(m, flags & VM_READ ? 'r' : '-');
+	seq_putc(m, flags & VM_WRITE ? 'w' : '-');
+	seq_putc(m, flags & VM_EXEC ? 'x' : '-');
+	seq_putc(m, flags & VM_MAYSHARE ? 's' : 'p');
+	seq_put_hex_ll(m, " ", pgoff, 8);
+	seq_put_hex_ll(m, " ", MAJOR(dev), 2);
+	seq_put_hex_ll(m, ":", MINOR(dev), 2);
+	seq_put_decimal_ull(m, " ", ino);
+	seq_putc(m, ' ');
 }
 
 static void
diff --git a/fs/seq_file.c b/fs/seq_file.c
index 4be761c1a03d..fb37ec42fae2 100644
--- a/fs/seq_file.c
+++ b/fs/seq_file.c
@@ -670,6 +670,26 @@ void seq_puts(struct seq_file *m, const char *s)
 }
 EXPORT_SYMBOL(seq_puts);
 
+static inline void seq_put_delimeter(struct seq_file *m, const char *delimiter)
+{
+	int len;
+
+	if (!delimiter || !delimiter[0])
+		return;
+
+	if (delimiter[1] == 0)
+		return seq_putc(m, delimiter[0]);
+
+	len = strlen(delimiter);
+	if (m->count + len >= m->size) {
+		seq_set_overflow(m);
+		return;
+	}
+
+	memcpy(m->buf + m->count, delimiter, len);
+	m->count += len;
+}
+
 /*
  * A helper routine for putting decimal numbers without rich format of printf().
  * only 'unsigned long long' is supported.
@@ -685,12 +705,7 @@ void seq_put_decimal_ull(struct seq_file *m, const char *delimiter,
 	if (m->count + 2 >= m->size) /* we'll write 2 bytes at least */
 		goto overflow;
 
-	len = strlen(delimiter);
-	if (m->count + len >= m->size)
-		goto overflow;
-
-	memcpy(m->buf + m->count, delimiter, len);
-	m->count += len;
+	seq_put_delimeter(m, delimiter);
 
 	if (m->count + 1 >= m->size)
 		goto overflow;
@@ -712,6 +727,46 @@ void seq_put_decimal_ull(struct seq_file *m, const char *delimiter,
 }
 EXPORT_SYMBOL(seq_put_decimal_ull);
 
+/**
+ * seq_put_hex_ll - put a number in hexadecimal notation
+ * @m: seq_file identifying the buffer to which data should be written
+ * @delimiter: a string which is printed before the number
+ * @v: the number
+ * @width: a minimum field width
+ *
+ * seq_put_hex_ll(m, "", v, 8) is equal to seq_printf(m, "0x08llx", v)
+ *
+ * This routine is very quick when you show lots of numbers.
+ * In usual cases, it will be better to use seq_printf(). It's easier to read.
+ */
+void seq_put_hex_ll(struct seq_file *m, const char *delimiter,
+				unsigned long long v, int width)
+{
+	int i, len;
+
+	seq_put_delimeter(m, delimiter);
+
+	len = (sizeof(v) * 8 - __builtin_clzll(v) + 3) / 4;
+
+	if (unlikely(len == 0))
+		len = 1;
+
+	if (len < width)
+		len = width;
+
+	if (m->count + len > m->size)
+		goto overflow;
+
+	for (i = len - 1; i >= 0; i--) {
+		m->buf[m->count + i] = hex_asc[0xf & v];
+		v = v >> 4;
+	}
+	m->count += len;
+	return;
+overflow:
+	seq_set_overflow(m);
+}
+
 void seq_put_decimal_ll(struct seq_file *m, const char *delimiter, long long num)
 {
 	int len;
@@ -719,12 +774,7 @@ void seq_put_decimal_ll(struct seq_file *m, const char *delimiter, long long num
 	if (m->count + 3 >= m->size) /* we'll write 2 bytes at least */
 		goto overflow;
 
-	len = strlen(delimiter);
-	if (m->count + len >= m->size)
-		goto overflow;
-
-	memcpy(m->buf + m->count, delimiter, len);
-	m->count += len;
+	seq_put_delimeter(m, delimiter);
 
 	if (m->count + 2 >= m->size)
 		goto overflow;
diff --git a/include/linux/seq_file.h b/include/linux/seq_file.h
index 09c6e28746f9..53f238934d7f 100644
--- a/include/linux/seq_file.h
+++ b/include/linux/seq_file.h
@@ -121,6 +121,9 @@ void seq_puts(struct seq_file *m, const char *s);
 void seq_put_decimal_ull(struct seq_file *m, const char *delimiter,
 			 unsigned long long num);
 void seq_put_decimal_ll(struct seq_file *m, const char *delimiter, long long num);
+void seq_put_hex_ll(struct seq_file *m, const char *delimiter,
+		    unsigned long long v, int width);
+
 void seq_escape(struct seq_file *m, const char *s, const char *esc);
 
 void seq_hex_dump(struct seq_file *m, const char *prefix_str, int prefix_type,
seq_put_hex_ll() prints a number in hexadecimal notation and works
faster than seq_printf().

== test.py
num = 0
with open("/proc/1/maps") as f:
	while num < 10000 :
		data = f.read()
		f.seek(0, 0)
		num = num + 1
==

== Before patch ==
$ time python test.py

real	0m1.561s
user	0m0.257s
sys	0m1.302s

== After patch ==
$ time python test.py

real	0m0.986s
user	0m0.279s
sys	0m0.707s

$ perf -g record python test.py:
== Before patch ==
-   67.42%     2.82%  python   [kernel.kallsyms] [k] show_map_vma.isra.22
   - 64.60% show_map_vma.isra.22
      - 44.98% seq_printf
         - seq_vprintf
            - vsnprintf
               + 14.85% number
               + 12.22% format_decode
                 5.56% memcpy_erms
      + 15.06% seq_path
      + 4.42% seq_pad
   + 2.45% __GI___libc_read

== After patch ==
-   47.35%     3.38%  python   [kernel.kallsyms] [k] show_map_vma.isra.23
   - 43.97% show_map_vma.isra.23
      + 20.84% seq_path
      - 15.73% show_vma_header_prefix
           10.55% seq_put_hex_ll
         + 2.65% seq_put_decimal_ull
           0.95% seq_putc
      + 6.96% seq_pad
   + 2.94% __GI___libc_read

Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Andrei Vagin <avagin@openvz.org>
---
 fs/proc/task_mmu.c       | 21 ++++++++------
 fs/seq_file.c            | 74 ++++++++++++++++++++++++++++++++++++++++--------
 include/linux/seq_file.h |  3 ++
 3 files changed, 77 insertions(+), 21 deletions(-)
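The read-and-rewind loop from test.py can also be written in C, which may help when interpreter overhead in the user-time column is a concern. The following is only a sketch of that idea: `reread_file()` is a hypothetical helper name, the 64 KiB buffer size is an arbitrary assumption, and nothing here was part of the posted benchmark.

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Read `path` to end-of-file `iterations` times, rewinding with lseek()
 * after each pass, as test.py does with f.read() and f.seek(0, 0).
 * Returns the total number of bytes read, or -1 on any error.
 */
static long long reread_file(const char *path, int iterations)
{
	char buf[65536];	/* arbitrary chunk size (assumption) */
	long long total = 0;
	int fd, i;

	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -1;

	for (i = 0; i < iterations; i++) {
		ssize_t n;

		/* A single read() may return only part of the file. */
		while ((n = read(fd, buf, sizeof(buf))) > 0)
			total += n;

		if (n < 0 || lseek(fd, 0, SEEK_SET) < 0) {
			close(fd);
			return -1;
		}
	}

	close(fd);
	return total;
}
```

A caller would pass a path such as "/proc/1/maps" and the iteration count used in the commit message (10000), then time the binary with `time` in the same way as the Python script.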