Message ID | 20211120112850.46047-1-laoar.shao@gmail.com (mailing list archive) |
---|---|
State | Not Applicable |
Headers | show |
Series | [v2] kthread: dynamically allocate memory to store kthread's full name | expand |
Context | Check | Description |
---|---|---|
netdev/tree_selection | success | Not a local patch |
On Sat 2021-11-20 11:28:50, Yafang Shao wrote: > When I was implementing a new per-cpu kthread cfs_migration, I found the > comm of it "cfs_migration/%u" is truncated due to the limitation of > TASK_COMM_LEN. > > One possible way to fix this issue is extending the task comm size, but > as task->comm is used in lots of places, that may cause some potential > buffer overflows. Another more conservative approach is introducing a new > pointer to store kthread's full name if it is truncated, which won't > introduce too much overhead as it is in the non-critical path. Finally we > make a dicision to use the second approach. See also the discussions in > this thread: > https://lore.kernel.org/lkml/20211101060419.4682-1-laoar.shao@gmail.com/ > > After this change, the full name of these truncated kthreads will be > displayed via /proc/[pid]/comm: > Signed-off-by: Yafang Shao <laoar.shao@gmail.com> Looks good to me: Reviewed-by: Petr Mladek <pmladek@suse.com> Best Regards, Petr
On 20.11.21 12:28, Yafang Shao wrote: > When I was implementing a new per-cpu kthread cfs_migration, I found the > comm of it "cfs_migration/%u" is truncated due to the limitation of > TASK_COMM_LEN. For example, the comm of the percpu thread on CPU10~19 are > all with the same name "cfs_migration/1", which will confuse the user. This > issue is not critical, because we can get the corresponding CPU from the > task's Cpus_allowed. But for kthreads correspoinding to other hardware > devices, it is not easy to get the detailed device info from task comm, > for example, > > jbd2/nvme0n1p2- > xfs-reclaim/sdf > > Currently there are so many truncated kthreads: > > rcu_tasks_kthre > rcu_tasks_rude_ > rcu_tasks_trace > poll_mpt3sas0_s > ext4-rsv-conver > xfs-reclaim/sd{a, b, c, ...} > xfs-blockgc/sd{a, b, c, ...} > xfs-inodegc/sd{a, b, c, ...} > audit_send_repl > ecryptfs-kthrea > vfio-irqfd-clea > jbd2/nvme0n1p2- > ... > > We can shorten these names to work around this problem, but it may be > not applied to all of the truncated kthreads. Take 'jbd2/nvme0n1p2-' for > example, it is a nice name, and it is not a good idea to shorten it. > > One possible way to fix this issue is extending the task comm size, but > as task->comm is used in lots of places, that may cause some potential > buffer overflows. Another more conservative approach is introducing a new > pointer to store kthread's full name if it is truncated, which won't > introduce too much overhead as it is in the non-critical path. Finally we > make a dicision to use the second approach. See also the discussions in > this thread: > https://lore.kernel.org/lkml/20211101060419.4682-1-laoar.shao@gmail.com/ > > After this change, the full name of these truncated kthreads will be > displayed via /proc/[pid]/comm: > > rcu_tasks_kthread > rcu_tasks_rude_kthread > rcu_tasks_trace_kthread > poll_mpt3sas0_statu > ext4-rsv-conversion > xfs-reclaim/sdf1 > xfs-blockgc/sdf1 > xfs-inodegc/sdf1 > audit_send_reply > ecryptfs-kthread > vfio-irqfd-cleanup > jbd2/nvme0n1p2-8 I do wonder if that could break some user space that assumes these names have maximum length .. But LGTM Reviewed-by: David Hildenbrand <david@redhat.com>
On Thu 2021-11-25 10:36:49, David Hildenbrand wrote: > On 20.11.21 12:28, Yafang Shao wrote: > > When I was implementing a new per-cpu kthread cfs_migration, I found the > > comm of it "cfs_migration/%u" is truncated due to the limitation of > > TASK_COMM_LEN. For example, the comm of the percpu thread on CPU10~19 are > > all with the same name "cfs_migration/1", which will confuse the user. This > > issue is not critical, because we can get the corresponding CPU from the > > task's Cpus_allowed. But for kthreads correspoinding to other hardware > > devices, it is not easy to get the detailed device info from task comm, > > for example, > > > > jbd2/nvme0n1p2- > > xfs-reclaim/sdf > > > > Currently there are so many truncated kthreads: > > > > rcu_tasks_kthre > > rcu_tasks_rude_ > > rcu_tasks_trace > > poll_mpt3sas0_s > > ext4-rsv-conver > > xfs-reclaim/sd{a, b, c, ...} > > xfs-blockgc/sd{a, b, c, ...} > > xfs-inodegc/sd{a, b, c, ...} > > audit_send_repl > > ecryptfs-kthrea > > vfio-irqfd-clea > > jbd2/nvme0n1p2- > > ... > > > > We can shorten these names to work around this problem, but it may be > > not applied to all of the truncated kthreads. Take 'jbd2/nvme0n1p2-' for > > example, it is a nice name, and it is not a good idea to shorten it. > > > > One possible way to fix this issue is extending the task comm size, but > > as task->comm is used in lots of places, that may cause some potential > > buffer overflows. Another more conservative approach is introducing a new > > pointer to store kthread's full name if it is truncated, which won't > > introduce too much overhead as it is in the non-critical path. Finally we > > make a dicision to use the second approach. See also the discussions in > > this thread: > > https://lore.kernel.org/lkml/20211101060419.4682-1-laoar.shao@gmail.com/ > > > > After this change, the full name of these truncated kthreads will be > > displayed via /proc/[pid]/comm: > > > > rcu_tasks_kthread > > rcu_tasks_rude_kthread > > rcu_tasks_trace_kthread > > poll_mpt3sas0_statu > > ext4-rsv-conversion > > xfs-reclaim/sdf1 > > xfs-blockgc/sdf1 > > xfs-inodegc/sdf1 > > audit_send_reply > > ecryptfs-kthread > > vfio-irqfd-cleanup > > jbd2/nvme0n1p2-8 > > I do wonder if that could break some user space that assumes these names > have maximum length .. There is high chance that we will be on the safe side. Workqueue kthreads already provided longer names. They are even dynamic because the currently handled workqueue name is part of the name, see wq_worker_comm(). Best Regards, Petr
On 25.11.21 15:46, Petr Mladek wrote: > On Thu 2021-11-25 10:36:49, David Hildenbrand wrote: >> On 20.11.21 12:28, Yafang Shao wrote: >>> When I was implementing a new per-cpu kthread cfs_migration, I found the >>> comm of it "cfs_migration/%u" is truncated due to the limitation of >>> TASK_COMM_LEN. For example, the comm of the percpu thread on CPU10~19 are >>> all with the same name "cfs_migration/1", which will confuse the user. This >>> issue is not critical, because we can get the corresponding CPU from the >>> task's Cpus_allowed. But for kthreads correspoinding to other hardware >>> devices, it is not easy to get the detailed device info from task comm, >>> for example, >>> >>> jbd2/nvme0n1p2- >>> xfs-reclaim/sdf >>> >>> Currently there are so many truncated kthreads: >>> >>> rcu_tasks_kthre >>> rcu_tasks_rude_ >>> rcu_tasks_trace >>> poll_mpt3sas0_s >>> ext4-rsv-conver >>> xfs-reclaim/sd{a, b, c, ...} >>> xfs-blockgc/sd{a, b, c, ...} >>> xfs-inodegc/sd{a, b, c, ...} >>> audit_send_repl >>> ecryptfs-kthrea >>> vfio-irqfd-clea >>> jbd2/nvme0n1p2- >>> ... >>> >>> We can shorten these names to work around this problem, but it may be >>> not applied to all of the truncated kthreads. Take 'jbd2/nvme0n1p2-' for >>> example, it is a nice name, and it is not a good idea to shorten it. >>> >>> One possible way to fix this issue is extending the task comm size, but >>> as task->comm is used in lots of places, that may cause some potential >>> buffer overflows. Another more conservative approach is introducing a new >>> pointer to store kthread's full name if it is truncated, which won't >>> introduce too much overhead as it is in the non-critical path. Finally we >>> make a dicision to use the second approach. See also the discussions in >>> this thread: >>> https://lore.kernel.org/lkml/20211101060419.4682-1-laoar.shao@gmail.com/ >>> >>> After this change, the full name of these truncated kthreads will be >>> displayed via /proc/[pid]/comm: >>> >>> rcu_tasks_kthread >>> rcu_tasks_rude_kthread >>> rcu_tasks_trace_kthread >>> poll_mpt3sas0_statu >>> ext4-rsv-conversion >>> xfs-reclaim/sdf1 >>> xfs-blockgc/sdf1 >>> xfs-inodegc/sdf1 >>> audit_send_reply >>> ecryptfs-kthread >>> vfio-irqfd-cleanup >>> jbd2/nvme0n1p2-8 >> >> I do wonder if that could break some user space that assumes these names >> have maximum length .. > > There is high chance that we will be on the safe side. Workqueue > kthreads already provided longer names. They are even dynamic > because the currently handled workqueue name is part of the name, > see wq_worker_comm(). Great, thanks!
diff --git a/fs/proc/array.c b/fs/proc/array.c index ff869a66b34e..4321aa63835d 100644 --- a/fs/proc/array.c +++ b/fs/proc/array.c @@ -92,6 +92,7 @@ #include <linux/string_helpers.h> #include <linux/user_namespace.h> #include <linux/fs_struct.h> +#include <linux/kthread.h> #include <asm/processor.h> #include "internal.h" @@ -102,6 +103,8 @@ void proc_task_name(struct seq_file *m, struct task_struct *p, bool escape) if (p->flags & PF_WQ_WORKER) wq_worker_comm(tcomm, sizeof(tcomm), p); + else if (p->flags & PF_KTHREAD) + get_kthread_comm(tcomm, sizeof(tcomm), p); else __get_task_comm(tcomm, sizeof(tcomm), p); diff --git a/include/linux/kthread.h b/include/linux/kthread.h index 346b0f269161..2a5c04494663 100644 --- a/include/linux/kthread.h +++ b/include/linux/kthread.h @@ -33,6 +33,7 @@ struct task_struct *kthread_create_on_cpu(int (*threadfn)(void *data), unsigned int cpu, const char *namefmt); +void get_kthread_comm(char *buf, size_t buf_size, struct task_struct *tsk); void set_kthread_struct(struct task_struct *p); void kthread_set_per_cpu(struct task_struct *k, int cpu); diff --git a/kernel/kthread.c b/kernel/kthread.c index 7113003fab63..a70cd5dc94e3 100644 --- a/kernel/kthread.c +++ b/kernel/kthread.c @@ -60,6 +60,8 @@ struct kthread { #ifdef CONFIG_BLK_CGROUP struct cgroup_subsys_state *blkcg_css; #endif + /* To store the full name if task comm is truncated. */ + char *full_name; }; enum KTHREAD_BITS { @@ -93,6 +95,18 @@ static inline struct kthread *__to_kthread(struct task_struct *p) return kthread; } +void get_kthread_comm(char *buf, size_t buf_size, struct task_struct *tsk) +{ + struct kthread *kthread = to_kthread(tsk); + + if (!kthread || !kthread->full_name) { + __get_task_comm(buf, buf_size, tsk); + return; + } + + strscpy_pad(buf, kthread->full_name, buf_size); +} + void set_kthread_struct(struct task_struct *p) { struct kthread *kthread; @@ -118,9 +132,13 @@ void free_kthread_struct(struct task_struct *k) * or if kmalloc() in kthread() failed. */ kthread = to_kthread(k); + if (!kthread) + return; + #ifdef CONFIG_BLK_CGROUP - WARN_ON_ONCE(kthread && kthread->blkcg_css); + WARN_ON_ONCE(kthread->blkcg_css); #endif + kfree(kthread->full_name); kfree(kthread); } @@ -406,12 +424,22 @@ struct task_struct *__kthread_create_on_node(int (*threadfn)(void *data), task = create->result; if (!IS_ERR(task)) { char name[TASK_COMM_LEN]; + va_list aq; + int len; /* * task is already visible to other tasks, so updating * COMM must be protected. */ - vsnprintf(name, sizeof(name), namefmt, args); + va_copy(aq, args); + len = vsnprintf(name, sizeof(name), namefmt, aq); + va_end(aq); + if (len >= TASK_COMM_LEN) { + struct kthread *kthread = to_kthread(task); + + /* leave it truncated when out of memory. */ + kthread->full_name = kvasprintf(GFP_KERNEL, namefmt, args); + } set_task_comm(task, name); } kfree(create);
When I was implementing a new per-cpu kthread cfs_migration, I found the comm of it "cfs_migration/%u" is truncated due to the limitation of TASK_COMM_LEN. For example, the comm of the percpu thread on CPU10~19 are all with the same name "cfs_migration/1", which will confuse the user. This issue is not critical, because we can get the corresponding CPU from the task's Cpus_allowed. But for kthreads correspoinding to other hardware devices, it is not easy to get the detailed device info from task comm, for example, jbd2/nvme0n1p2- xfs-reclaim/sdf Currently there are so many truncated kthreads: rcu_tasks_kthre rcu_tasks_rude_ rcu_tasks_trace poll_mpt3sas0_s ext4-rsv-conver xfs-reclaim/sd{a, b, c, ...} xfs-blockgc/sd{a, b, c, ...} xfs-inodegc/sd{a, b, c, ...} audit_send_repl ecryptfs-kthrea vfio-irqfd-clea jbd2/nvme0n1p2- ... We can shorten these names to work around this problem, but it may be not applied to all of the truncated kthreads. Take 'jbd2/nvme0n1p2-' for example, it is a nice name, and it is not a good idea to shorten it. One possible way to fix this issue is extending the task comm size, but as task->comm is used in lots of places, that may cause some potential buffer overflows. Another more conservative approach is introducing a new pointer to store kthread's full name if it is truncated, which won't introduce too much overhead as it is in the non-critical path. Finally we make a dicision to use the second approach. See also the discussions in this thread: https://lore.kernel.org/lkml/20211101060419.4682-1-laoar.shao@gmail.com/ After this change, the full name of these truncated kthreads will be displayed via /proc/[pid]/comm: rcu_tasks_kthread rcu_tasks_rude_kthread rcu_tasks_trace_kthread poll_mpt3sas0_statu ext4-rsv-conversion xfs-reclaim/sdf1 xfs-blockgc/sdf1 xfs-inodegc/sdf1 audit_send_reply ecryptfs-kthread vfio-irqfd-cleanup jbd2/nvme0n1p2-8 Suggested-by: Petr Mladek <pmladek@suse.com> Suggested-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Yafang Shao <laoar.shao@gmail.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Arnaldo Carvalho de Melo <arnaldo.melo@gmail.com> Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com> Cc: Andrii Nakryiko <andrii.nakryiko@gmail.com> Cc: Michal Miroslaw <mirq-linux@rere.qmqm.pl> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Matthew Wilcox <willy@infradead.org> Cc: David Hildenbrand <david@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Kees Cook <keescook@chromium.org> Cc: Petr Mladek <pmladek@suse.com> --- Changes since v1: 1. leave it turncated when out of memory (Kees & Petr) 2. do null check in free_kthread_struct (Petr) --- fs/proc/array.c | 3 +++ include/linux/kthread.h | 1 + kernel/kthread.c | 32 ++++++++++++++++++++++++++++++-- 3 files changed, 34 insertions(+), 2 deletions(-)